Pierre Del Moral 



Feynman-Kac Formulae

Genealogical and Interacting 
Particle Systems with 
Applications 



Springer



Probability and its Applications 

A Series of the Applied Probability Trust 
Editors: J. Gani, C.C. Heyde, T.G. Kurtz 



Springer
New York, Berlin, Heidelberg, Hong Kong, London, Milan, Paris, Tokyo




Probability and its Applications 



Anderson: Continuous-Time Markov Chains. 

Azencott/Dacunha-Castelle: Series of Irregular Observations. 

Bass: Diffusions and Elliptic Operators. 

Bass: Probabilistic Techniques in Analysis. 

Choi: ARMA Model Identification. 

Daley/Vere-Jones: An Introduction to the Theory of Point Processes.

Volume 1: Elementary Theory and Methods, Second Edition. 
de la Peña/Giné: Decoupling: From Dependence to Independence.

Del Moral: Feynman-Kac Formulae: Genealogical and Interacting Particle Systems
with Applications.

Durrett: Probability Models for DNA Sequence Evolution. 

Galambos/Simonelli: Bonferroni-type Inequalities with Applications. 

Gani (Editor): The Craft of Probabilistic Modelling. 

Grandell: Aspects of Risk Theory. 

Gut: Stopped Random Walks. 

Guyon: Random Fields on a Network. 

Kallenberg: Foundations of Modern Probability, Second Edition.

Last/Brandt: Marked Point Processes on the Real Line. 

Leadbetter/Lindgren/Rootzén: Extremes and Related Properties of Random Sequences
and Processes. 

Nualart: The Malliavin Calculus and Related Topics. 

Rachev/Rüschendorf: Mass Transportation Problems. Volume I: Theory.
Rachev/Rüschendorf: Mass Transportation Problems. Volume II: Applications.
Resnick: Extreme Values, Regular Variation and Point Processes. 

Shedler: Regeneration and Networks of Queues. 

Silvestrov: Limit Theorems for Randomly Stopped Stochastic Processes.
Thorisson: Coupling, Stationarity, and Regeneration. 

Todorovic: An Introduction to Stochastic Processes and Their Applications. 




With 15 Illustrations




Pierre Del Moral 

Laboratoire de Statistique et Probabilités
Université Paul Sabatier
118, Route de Narbonne
31062 Toulouse, Cedex 4
France 

delmoral@cict.fr 



Series Editors 
J. Gani 

Stochastic Analysis 
Group, CMA 
Australian National 
University 
Canberra, ACT 0200 
Australia 



C.C. Heyde 
Stochastic Analysis 
Group, CMA 
Australian National 
University 
Canberra, ACT 0200 
Australia 



T.G. Kurtz 
Department of 
Mathematics 
University of Wisconsin 
480 Lincoln Drive 
Madison, WI 53706 
USA 



Library of Congress Cataloging-in-Publication Data 
Del Moral, Pierre. 

Feynman-Kac formulae : genealogical and interacting particle systems with applications /

Pierre Del Moral. 

p. cm. — (Probability and its applications) 

Includes bibliographical references and index. 

1. Path integrals. 2. Evolution equations. 3. Quantum theory. 4. Vector spaces. I. Title.

II. Springer series in statistics. Probability and its applications 
QC174.17.P27D45 2004 

530.12— dc22 2003063340 

Printed on acid-free paper.

ISBN 978-1-4419-1902-1 ISBN 978-1-4684-9393-1 (eBook)

DOI 10.1007/978-1-4684-9393-1 

© 2004 Springer-Verlag New York, LLC 

Softcover reprint of the hardcover 1st edition 2004 

All rights reserved. This work may not be translated or copied in whole or in part without the written 
permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY
10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in
connection with any form of information storage and retrieval, electronic adaptation, computer
software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if 
they are not identified as such, is not to be taken as an expression of opinion as to whether or not 
they are subject to proprietary rights. 



987654321 SPIN 10942918 

Springer-Verlag is a part of Springer Science+Business Media
springeronline.com 




To Laurence, Tiffany and Timothée




Preface 



The central theme of this book concerns Feynman-Kac path distributions, 
interacting particle systems, and genealogical tree based models. This re- 
cent theory has been stimulated from different directions including biology, 
physics, probability, and statistics, as well as from many branches in engi- 
neering science, such as signal processing, telecommunications, and network 
analysis. Over the last decade, this subject has matured in ways that make 
it more complete and beautiful to learn and to use. The objective of this 
book is to provide a detailed and self-contained discussion on these connec- 
tions and the different aspects of this subject. Although particle methods 
and Feynman-Kac models owe their origins to physics and statistical me-
chanics, particularly to the kinetic theory of fluids and gases, this book
can be read without any specific knowledge in these fields. I have tried to
make this book accessible to senior undergraduate students having some
familiarity with the theory of stochastic processes to advanced postgradu- 
ate students as well as researchers and engineers in mathematics, statistics, 
physics, biology and engineering. 

I have also tried to give an “exposé” of the modern mathematical theory
that is useful for the analysis of the asymptotic behavior of Feynman-Kac 
and particle models. Researchers and applied mathematicians will find a 
collection of modern techniques from various branches of probability and 
stochastic analysis, including convergence of empirical processes, fluctua-
tion and large-deviation analysis, semigroup and martingale techniques and
propagation of chaos, as well as the asymptotic stability and the concen-
tration of measure-valued processes, functional inequalities, ergodic coeffi-
cients, and contractions of Markov operators and nonlinear semigroups.

Besides the mathematical analysis of Feynman-Kac distribution flows 
and interacting particle models, I have developed a rather large class of 
applications to specific models from various scientific disciplines. The prac- 
titioner will find a source of useful convergence estimates as well as a de- 
tailed list of concrete examples of particle approximations for real models, 
including restricted Markov chain simulations, random motions in absorb-
ing media, spectral analysis of Schrödinger operators and Feynman-Kac
semigroups, rare event analysis, Dirichlet boundary problems, nonlinear 
filtering problems, interacting Kalman-Bucy filters, directed polymer sim- 
ulations, and interacting Metropolis type algorithms. While this diversity 
of application model areas is part of the charm of the particle theory of 
Feynman-Kac models, the list of topics above is not exhaustive and actually
only reflects the tastes and interests of the author. 

One objective in writing this book was to throw some new light on some 
interesting links between sometimes too disconnected physical, engineer-
ing, and mathematical domains. In this connection, I would like to thank
Springer-Verlag and the editorial board for the invitation to write a book on
this theme. I undertook this project for two main reasons. First, I felt that
there was no accessible treatment on Feynman-Kac path models and their 
interacting particle approximation schemes. Second, the abstract concepts 
and the probability theory are now at a point where they provide a natural 
and unifying mathematical basis for a large class of heuristic-like Monte 
Carlo algorithms currently used in Bayesian statistics, engineering science, 
and in physics and biology since the beginning of the 1950s. I also hope that
practitioners as well as graduate students from Bayesian schools will find 
a great advantage in using the abstract Feynman-Kac and particle theory 
developed in this book, provided they overcome the fear of seeing integral 
operators rather than summations or integrals with respect to some density 
function. Besides its mathematical elegance, the abstract formulation is less 
“notationally consuming” and of great practical value. It gives a powerful
applied tool to be used in modeling nonlinear estimation problems as well 
as in studying and developing interacting particle approximation models. 
I hope that these ideas will fruitfully serve the further development of the 
field and that their propagation will influence other new application areas. 

The material in this book can serve as a basis for different types of 
advanced courses on probability. The first type, geared towards pure ap- 
plications of particle methods, could be centered around the Feynman-Kac 
modeling techniques and their application model areas discussed in Chap-
ters 2, 3, 11, and 12. To aid more detailed studies, these lectures could
be completed either with the presentation of one of the more application-
related articles selected from the list of references or with a new application
model. More theoretical types of courses could cover the material in Chap- 
ters 4 through 10. A semester-long course would cover the stability and the 
annealed properties of Feynman-Kac semigroups derived in Chapters 4, 5, 
and 6 (possibly excluding Chapter 5). There is also enough material in the
book to support three other sequences of courses. These lectures would 
cover, respectively, propagation of chaos (Chapters 7 and 8), central limit 
theorems (Chapters 7 and 9), and large-deviation principles (Chapter 10).

A part of the material presented in this book is based on a series of lec- 
tures I delivered at the 24th Finnish Summer School on Probability Theory 
in Lahti in spring 2002 and that was arranged by the Finnish Graduate 
School in Stochastics and the Rolf Nevanlinna Institute. It is also partly 
based on a second-year graduate course on particle methods and nonlinear 
filtering I gave at the Operations Research and Financial Engineering de- 
partment of Princeton University in fall 2001. An overview was presented 
in three one-hour lectures for the Symposium on Numerical Stochastics
(April 1999) at the Fields Institute for Research in Mathematical Sciences 
(Toronto) and at the same time was presented at the University of Alberta, 
Edmonton, with the support of the Canadian Mathematics of Information 
Technology and Complex Systems project. 

Some of the material developed in this book results from various fruit-
ful collaborations with Frédéric Cérou, Dan Crisan, Donald Dawson, Ar-
naud Doucet, François Le Gland, Michael Kouritzin, Pascal Lezaud, Michel
Ledoux, Terry Lyons, Philip Protter, Samuel Tindel, Frederi Viens, Tim
Zajic, and particularly with Alice Guionnet, Jean Jacod, and Laurent Mi-
clo. Most of the text also proposes many new contributions to the sub-
ject. The reader will find a series of new deeper studies of topics such
as contraction properties of nonlinear semigroups, functional entropy in-
equalities, uniform and precise increasing propagation-of-chaos estimates,
central limit and Berry-Esseen type theorems, large-deviation principles for
strong topologies on path-distribution spaces, new branching and genealog-
ical particle models, advanced Feynman-Kac modeling techniques, and a
fairly new class of application modeling areas.

While continuous time models and their applications in physics and engi-
neering sciences are discussed in Chapters 1 and 12, I have not hesitated to
concentrate the exposition on discrete Feynman-Kac and particle models. 
The reasons are twofold: 

First, the analysis of discrete time models only requires a small prereq- 
uisite on Markov chains, while the study of continuous time models would 
have required different and specific knowledge on stochastic analysis, par-
ticularly on interacting jump models and Markov processes taking values
in path spaces. Since one of the objectives was to prepare a text that was 
as self-contained as possible, a full presentation of both classes of models 
would have been too much of a digression.

The second reason is that apart from some mathematical technicalities
the asymptotic analysis of continuous time models is in some sense more 
sophisticated but generally follows the same intuitions and the same line of
argument as in the discrete time case. On the other hand, to my knowledge, 
various convergence theorems such as the uniform and increasing propaga- 
tion of chaos, the Berry-Esseen estimates, and the strong large-deviation
principles, presented respectively in Chapters 8, 9, and 10, remain nowa- 
days open problems for continuous time models. 

The reader interested in continuous time models can complete the study 
of this book following the four articles [96, 97, 98, 102]. For an introduc- 
tion to interacting particle interpretation of continuous time Feynman-Kac 
models we recommend the review article on genetic type models [96] as 
well as [98] and [97, 102]. The last two referenced articles discuss both 
discrete and continuous time with applications to Schrödinger generators,
filtering problems and fixed points of integro-differential equations. I rec-
ommend [102] as a start for someone who has never studied the subject
before. The article [97] provides a series of advanced lectures on simple 
central limit theorems and exponential estimates. Strong propagation-of- 
chaos results using coupling and semigroup techniques can be found in [95]. 
The series of articles above can also be completed by some studies on par- 
ticle approximation of stochastic Feynman-Kac flows [63, 66, 64, 65] and 
their applications in the numerical solution of stochastic partial differential 
equations with non-linear potential [296]. I hope the former volume and the 
list of contributions to continuous time models above will guide the reader 
to understand the current state of the art on the topic and contribute to
many interesting open problems. 

I end this preface with a few words of advice to readers who are anxious 
about spending too much time on unnecessary immersions. The introduc-
tion of this book is an important step in entering into Feynman-Kac mod-
eling and interacting-particle methods. The applications described in this
opening section as well as in Chapters 11 and 12 should help the reader 
to find a concrete basis for going further through the mathematical as- 
pects of this theory. A complete description on how the theory is applied 
in each application model area would of course require separate volumes
with precise computer simulations and comparisons with different types 
of particle models and other existing algorithms. I have chosen to treat 
each subject in a rather short but self-contained way. Some applications 
are nowadays routine, and in this case I provide precise pointers to ex- 
isting more application-related articles in the literature. Most applications 
also provide new insight on the theoretical and potential applications of 
Feynman-Kac and particle methods to statistical physics and engineering 
science. In this case the programming of these new particle algorithms is
left to the reader. One natural path of “easy reading” will probably be to 
choose a familiar or attractive application area and to explore some selected 
parts of the book in terms of this choice. Nevertheless, this advice must not 
be taken too literally. To see the impact of genealogical tree-based particle 
methods, it is essential to understand the full force of Feynman-Kac mod- 
eling techniques on various research domains. Upon doing so, the reader 
will have a powerful weapon for the discovery of new Feynman-Kac inter- 
pretations and related particle numerical models. The principal challenge
is to understand the theory and the branching particle models well enough 
to reduce them to practice. 

I did not try to avoid repetition and each chapter starts with an intro- 
duction connecting the results developed in earlier parts with the current 
analysis. With a few exceptions, this book is self-contained and each chapter
can be read independently of the other ones. To get up to speed on Chap- 
ter 12 on applications, the reader is recommended to start with Chapters 2 
and 3, which contain the main concepts on Feynman-Kac modeling and in- 
teracting processes. Chapter 11 should also not be skipped since it contains 
a series of recipes on particle models to be combined with one another and 
applied in each application model area. In general, I did not give references 
in the text but in the introduction at the beginning of each chapter. I al- 
ready apologize for possible errors or for references that have been omitted 
due to the lack of accurate information. 

Finally, I would like to express my gratitude to the Centre National de 
la Recherche Scientifique (CNRS) which gave me the freedom and the op-
portunity to undertake this project and the Université Paul Sabatier of
Toulouse. I am also grateful to the University of Melbourne and Purdue 
University as well as to Princeton University, where part of this project was 
developed. Last but not least, I would like to extend my thanks to John 
Kimmel for his precious editorial assistance as well as for his encourage- 
ments during these last two years. 



Toulouse, France 
September, 2003 



Pierre Del Moral 




Contents 



1 Introduction 1 

1.1 On the Origins of Feynman-Kac and Particle Models .... 1 

1.2 Notation and Conventions 7 

1.3 Feynman-Kac Path Models 11 

1.3.1 Path-Space and Marginal Models 11 

1.3.2 Nonlinear Equations 13 

1.4 Motivating Examples 14

1.4.1 Engineering Science 14 

1.4.2 Bayesian Methodology 21 

1.4.3 Particle and Statistical Physics 22 

1.4.4 Biology 25 

1.4.5 Applied Probability and Statistics 28 

1.5 Interacting Particle Systems 29 

1.5.1 Discrete Time Models 30 

1.5.2 Continuous Time Models 34 

1.6 Sequential Monte Carlo Methodology 37 

1.7 Particle Interpretations 39 

1.8 A Contents Guide for the Reader 41 

2 Feynman-Kac Formulae 47 

2.1 Introduction 47 

2.2 An Introduction to Markov Chains 48 

2.2.1 Canonical Probability Spaces 49 

2.2.2 Path-Space Markov Models 51

2.2.3 Stopped Markov chains 52 

2.2.4 Examples 55 

2.3 Description of the Models 58 

2.4 Structural Stability Properties 61 

2.4.1 Path Space and Marginal Models 62 

2.4.2 Change of Reference Probability Measures 63 

2.4.3 Updated and Prediction Flow Models 65 

2.5 Distribution Flows Models 68 

2.5.1 Killing Interpretation 71 

2.5.2 Interacting Process Interpretation 73 

2.5.3 McKean Models 76 

2.5.4 Kalman-Bucy filters 79 

2.6 Feynman-Kac Models in Random Media 81 

2.6.1 Quenched and Annealed Feynman-Kac Flows .... 83 

2.6.2 Feynman-Kac Models in Distribution Space 85 

2.7 Feynman-Kac Semigroups 87 

2.7.1 Prediction Semigroups 88 

2.7.2 Updated Semigroups 91 

3 Genealogical and Interacting Particle Models 95 

3.1 Introduction 95 

3.2 Interacting Particle Interpretations 96 

3.3 Particle models with Degenerate Potential 99 

3.4 Historical and Genealogical Tree Models 103

3.4.1 Introduction 103

3.4.2 A Rigorous Approach and Related Transport Problems 105

3.4.3 Complete Genealogical Tree Models 108

3.5 Particle Approximation Measures 109 

3.5.1 Some Convergence Results 112 

3.5.2 Regularity Conditions 115 

4 Stability of Feynman-Kac Semigroups 121 

4.1 Introduction 121 

4.2 Contraction Properties of Markov Kernels 122 

4.2.1 h-relative Entropy 122 

4.2.2 Lipschitz Contractions 127 

4.3 Contraction Properties of Feynman-Kac Semigroups .... 132 

4.3.1 Functional Entropy Inequalities 134 

4.3.2 Contraction Coefficients 138

4.3.3 Strong Contraction Estimates 142 

4.3.4 Weak Regularity Properties 144 

4.4 Updated Feynman-Kac Models 146 

4.5 A Class of Stochastic Semigroups 152

5 Invariant Measures and Related Topics 157 

5.1 Introduction 157 

5.2 Existence and Uniqueness 160 

5.3 Invariant Measures and Feynman-Kac Modeling 161 

5.4 Feynman-Kac and Metropolis-Hastings Models 164 

5.5 Feynman-Kac-Metropolis Models 166 

5.5.1 Introduction 166 

5.5.2 The Genealogical Metropolis Particle Model 170 

5.5.3 Path Space Models and Restricted Markov Chains . 172 

5.5.4 Stability Properties 179 

6 Annealing Properties 187 

6.1 Introduction 187 

6.2 Feynman-Kac-Metropolis Models 189 

6.2.1 Description of the Model 189 

6.2.2 Regularity Properties 191 

6.2.3 Asymptotic Behavior 193 

6.3 Feynman-Kac Trapping Models 197

6.3.1 Description of the Model 197 

6.3.2 Regularity Properties 198 

6.3.3 Asymptotic Behavior 201 

6.3.4 Large-Deviation Analysis 204 

6.3.5 Concentration Levels 208 

7 Asymptotic Behavior 215 

7.1 Introduction 215 

7.2 Some Preliminaries 217 

7.2.1 McKean Interpretations 218 

7.2.2 Vanishing Potentials 219 

7.3 Inequalities for Independent Random Variables 221 

7.3.1 Lp and Exponential Inequalities 222 

7.3.2 Empirical Processes 227 

7.4 Strong Law of Large Numbers 231 

7.4.1 Extinction Probabilities 231 

7.4.2 Convergence of Empirical Processes 236 

7.4.3 Time-Uniform Estimates 244 

8 Propagation of Chaos 253 

8.1 Introduction 253 

8.2 Some Preliminaries 255 

8.3 Outline of Results 258 

8.4 Weak Propagation of Chaos 261 

8.5 Relative Entropy Estimates 262 

8.6 A Combinatorial Transport Equation 267 

8.7 Asymptotic Properties of Boltzmann-Gibbs Distributions . 271

8.8 Feynman-Kac Semigroups 277 

8.8.1 Marginal Models 278 

8.8.2 Path-Space Models 280 

8.9 Total Variation Estimates 282 

9 Central Limit Theorems 291 

9.1 Introduction 291 

9.2 Some Preliminaries 293 

9.3 Some Local Fluctuation Results 295 

9.4 Particle Density Profiles 300 

9.4.1 Unnormalized Measures 300 

9.4.2 Normalized Measures 301 

9.4.3 Killing Interpretations and Related Comparisons . . 303 

9.5 A Berry-Esseen Type Theorem 306 

9.6 A Donsker Type Theorem 318

9.7 Path-Space Models 322 

9.8 Covariance Functions 327 

10 Large-Deviation Principles 331 

10.1 Introduction 333 

10.2 Some Preliminary Results 339 

10.2.1 Topological Properties 339 

10.2.2 Idempotent Analysis 340 

10.2.3 Some Regularity Properties 344 

10.3 Cramér’s Method 347

10.4 Laplace-Varadhan’s Integral Techniques 351

10.5 Dawson-Gärtner Projective Limits Techniques 359

10.6 Sanov’s Theorem 363 

10.6.1 Introduction 363 

10.6.2 Topological Preliminaries 364 

10.6.3 Sanov’s Theorem in the τ-Topology 370

10.7 Path-Space and Interacting Particle Models 374 

10.7.1 Proof of Theorem 10.1.1 374 

10.7.2 Sufficient Conditions 376 

10.8 Particle Density Profile Models 377 

10.8.1 Introduction 377 

10.8.2 Strong Large-Deviation Principles 379 

11 Feynman-Kac and Interacting Particle Recipes 387 

11.1 Introduction 387 

11.2 Interacting Metropolis Models 389 

11.2.1 Introduction 389 

11.2.2 Feynman-Kac-Metropolis and Particle Models .... 390 

11.2.3 Interacting Metropolis and Gibbs Samplers 393 

11.3 An Overview of some General Principles 394

11.4 Descendant and Ancestral Genealogies 396 

11.5 Conditional Explorations 400 

11.6 State-Space Enlargements and Path-Particle Models .... 402 

11.7 Conditional Excursion Particle Models 404 

11.8 Branching Selection Variants 405 

11.8.1 Introduction 405 

11.8.2 Description of the Models 408 

11.8.3 Some Branching Selection Rules 409 

11.8.4 Some L2-mean Error Estimates 411

11.8.5 Long Time Behavior 417 

11.8.6 Conditional Branching Models 419 

11.9 Exercises 420 

12 Applications 427 

12.1 Introduction 427 

12.2 Random Excursion Models 429 

12.2.1 Introduction 429 

12.2.2 Dirichlet Problems with Boundary Conditions .... 431 

12.2.3 Multilevel Feynman-Kac Formulae 436

12.2.4 Dirichlet Problems with Hard Boundary Conditions 440 

12.2.5 Rare Event Analysis 444 

12.2.6 Asymptotic Particle Analysis of Rare Events .... 447 

12.2.7 Fluctuation Results and Some Comparisons 450 

12.2.8 Exercises 453 

12.3 Change of Reference Measures 459 

12.3.1 Introduction 459 

12.3.2 Importance Sampling 460 

12.3.3 Sequential Analysis of Probability Ratio Tests .... 462 

12.3.4 A Multisplitting Particle Approach 463 

12.3.5 Exercises 465 

12.4 Spectral Analysis of Feynman-Kac-Schrödinger Semigroups 469

12.4.1 Lyapunov Exponents and Spectral Radii 470 

12.4.2 Feynman-Kac Asymptotic Models 471 

12.4.3 Particle Lyapunov Exponents 473

12.4.4 Hard, Soft and Repulsive Obstacles 475 

12.4.5 Related Spectral Quantities 477 

12.4.6 Exercises 479 

12.5 Directed Polymers Simulation 484 

12.5.1 Feynman-Kac and Boltzmann-Gibbs Models 484

12.5.2 Evolutionary Particle Simulation Methods 487

12.5.3 Repulsive Interaction and Self-Avoiding Markov Chains 488

12.5.4 Attractive Interaction and Reinforced Markov Chains 490

12.5.5 Particle Polymerization Techniques 490

12.5.6 Exercises 495

12.6 Filtering/Smoothing and Path Estimation 497

12.6.1 Introduction 497 

12.6.2 Motivating Examples 500 

12.6.3 Feynman-Kac Representations 505 

12.6.4 Stability Properties of the Filtering Equations . . . 508 

12.6.5 Asymptotic Properties of Log-likelihood Functions . 510 

12.6.6 Particle Approximation Measures 512 

12.6.7 A Partially Linear/Gaussian Filtering Model .... 513 

12.6.8 Exercises 520 

References 523 

Index 549 




1 

Introduction 



1.1 On the Origins of Feynman-Kac and Particle 
Models 

The field of Feynman-Kac and particle models is one of the most active 
contact points between probability, engineering, and the natural sciences. 
It is hard to know where to start in describing its early contributions. 

The origins of Feynman-Kac formulae certainly started with the work of 
R.P. Feynman who in his doctoral dissertation (Princeton, 1942) provides a 
heuristic connection between the Schrödinger equation and N. Wiener path
integral theory. These lines of investigations were pursued and amplified by
M. Kac in the early 1950s (see for instance [194]). The idea was to express 
the semigroup of a quantum particle evolving in a potential in terms of a 
functional path-integral formula. Intuitively speaking, Feynman-Kac mea- 
sures enter the effects of the potential in the distribution of the paths of 
the particles. This “change of probability” on path space associated with 
a given potential function has considerably influenced several research di- 
rections in mathematical physics, stochastic processes, and other scientific 
disciplines. One of the fascinations of these models today is their use to 
model a rather large class of physical, biological, and engineering prob-
lems. From the point of view of physics, they represent for instance the
path distribution of a single particle evolving in absorbing and disordered
media (see for instance [295]). In this interpretation, the potential function 
represents a “killing or creation” rate related to the absorbing nature of 
the medium.

More generally, they can be regarded as the Boltzmann-Gibbs distribu- 
tion of certain physical or biological quantities, such as directed polymers in 
physical chemistry or genetic infinite-population models (see [92, 209, 280] 
and references therein). In this context, the potential function can be re- 
garded as a Hamiltonian or an energy function related to internal interac-
tions or to the selection pressure of the environment. From the perspective
of the engineer or the applied statistician, it usually represents a condi- 
tional distribution of a certain unknown quantity with respect to some 
observation process. This interpretation is currently used in advanced sig- 
nal processing, particularly in filtering estimation and Bayesian analysis 
(see [97, 125] and references therein). In these settings, the potential is 
rather regarded as a likelihood function of the states with respect to some 
observation process or some reference path. 

Stochastic particle algorithms belong to the class of Monte Carlo meth-
ods. Their sources may be found in the foundations of probability theory
with the pioneering work of J. Bernoulli (Ars Conjectandi, published in
1713), who introduced the concept of the probability of an event as the
ratio of favorable outcomes with respect to the number of all possible inde-
pendent outcomes. A decisive step in the modern development of probabil-
ity theory was the introduction in the 1920s by A. A. Markov (Calculus of
Probabilities, 3rd ed., St. Petersburg, 1913) of a theory of stochastic pro-
cesses that studies sequences of random variables evolving with time. This
new branch of probability led to rather intense activity in various scientific 
disciplines. The theory of Markov processes provides natural probabilis- 
tic interpretations of various evolution models arising in engineering and 
the natural sciences. One critical aspect of particle methods as opposed 
to any other numerical method is that it provides a “microscopic particle 
interpretation” of the physical or engineering evolution equation at hand. 
Another advantage of these probabilistic techniques is that they do not use 
any regularity information on the coefficients of the models and they apply 
to large scale models. The increasing fascination of these particle methods 
today is their use to solve numerically nonlinear equations in distribution 
space. The nonlinear structure of these distribution models induces a nat-
ural interaction or a branching mechanism in the evolution of the particle
approximation model. This rather recent aspect of particle methods takes
its origins from the 1960s with the development of fluid mechanics and
statistical physics. We refer the reader to the pioneering works of McK- 
ean [243, 244] (see also the more recent treatments [27, 28, 245, 273, 294] 
and references therein). 

The use of interacting particle methods in engineering science and more 
particularly in advanced signal processing is more recent. The first rigor- 
ous study in this field seems to be the article [75] published in 1996 on the 
applications of particle methods to nonlinear estimation problems. This 
article provides the first convergence result for a new class of interact- 
ing particle models originally presented as heuristic schemes in the begin- 




1.1 On the Origins of Feynman-Kac and Particle Models 3 



ning of the 1990s in three independent chains of articles [164, 163], [204], 
and [43, 205, 104, 105]. These studies were followed by four other arti- 
cles [76, 83, 84, 86] revealing the generality and the impact of these particle 
methods in solving numerically a rather large class of discrete generation
and abstract nonlinear measure-valued processes. In the same period two 
other independent works [64, 65] proposed another class of particle branch- 
ing variants for solving continuous-time filtering problems. Incidentally, and 
as we noticed in a more recent work [99], all of these mathematical tech- 
niques also apply directly without further work to analyze the asymptotic 
behavior of a class of genealogical tree particle models currently used in 
nonlinear smoothing and path estimation problems. 

Although precise mathematical statements and various detailed applica- 
tions are provided in the further development of this introductory chapter 
(see for instance Section 1.4 pp. 14-29), to motivate this introduction we 
already present one particular example from engineering science and more 
particularly from advanced signal processing, that gives some insights as 
to what this book is about. The example we have chosen is taken from 
[104, 105]. It is also known as “the Singer model” and is often used as 
a simplified radar model. We consider a three-dimensional Markov chain 
$X_n = (X_n^{(1)}, X_n^{(2)}, X_n^{(3)})$, $n \in \mathbb{N}$. The three coordinates represent respec-
tively the acceleration, the speed, and the position of an abstract target
evolving in the real line according to the dynamical equation

$$X_n^{(2)} = (1 - \alpha\Delta)\, X_{n-1}^{(2)} + \beta\Delta\, X_n^{(1)} \qquad (1.1)$$

where $(\alpha, \beta)$ is a pair of constant parameters and $\Delta \in (0, 1)$ corresponds
to the radar sampling period. The initial random variable $X_0$ represents
the unknown random location of the target. The random changes of the
acceleration coordinates may be modeled by a sequence of independent
Bernoulli random variables $\epsilon_n$ ($\in \{0,1\}$) and a sequence of independent
uniform random variables $W_n$ ($\in [0,a]$ for some $a \in \mathbb{R}$). The target $X_n$ is
partially observed by the radar measurements. The observations delivered
at each time $n \geq 0$ by the radar have the form

The random perturbations in radar measurements (induced for instance
by the thermal noise in complex electronic devices) are often modeled by
choosing a sequence of independent Gaussian random variables $V_n$ with zero
mean and, say, unit variance. In Figure 1.1, we have represented three
consecutive radar measurements of the evolving target $X_n$, $X_{n+1}$, and $X_{n+2}$.

FIGURE 1.1. Radar processing

The nonlinear filtering problem consists in estimating the conditional
distribution of the random path $(X_0, \ldots, X_n)$ of the target given the
observations $Y_p$ delivered by the radar up to time $n$

$$\mathrm{Law}(X_0, \ldots, X_n \mid Y_0, \ldots, Y_n) \qquad (1.2)$$

The particle approximation model of these path distributions is constructed
as follows. First, at time $t_0 = 0$, we sample $N$ independent locations of the
target, say $(X_0^i)_{1 \leq i \leq N}$, with the initial acceleration/speed/position random
components. Then, we evolve randomly these
initial points according to the dynamical equation (1.1) up to some fixed
time, say $t_1$ ($> t_0$). In other words, we sample $N$ independent copies
$X^i_{[t_0,t_1]} = (X^i_{t_0}, \ldots, X^i_{t_1})$ of the target from the origin $t_0$ up to time $t_1$. The likelihood
$W^i_{[t_0,t_1]}$ of each of these paths is defined by the formula

$$W^i_{[t_0,t_1]} = \exp\Bigl(-\frac{1}{2} \sum_{t_0 \leq p \leq t_1} \cdots \Bigr)$$

Loosely speaking, these $[0, 1]$-valued random exponential parameters measure
the adequacy of the sampled targets with respect to the observation
sequence. The way to update the initial configuration with respect to the
radar observations is not unique. For instance, we can select randomly $N$
conditionally independent paths $\widehat{X}^i_{[t_0,t_1]} = (\widehat{X}^i_{t_0}, \ldots, \widehat{X}^i_{t_1})$, $1 \leq i \leq N$, with respective
distributions

$$W^i_{[t_0,t_1]}\, \delta_{X^i_{[t_0,t_1]}} + \bigl(1 - W^i_{[t_0,t_1]}\bigr) \sum_{j=1}^{N} \frac{W^j_{[t_0,t_1]}}{\sum_{k=1}^{N} W^k_{[t_0,t_1]}}\, \delta_{X^j_{[t_0,t_1]}}$$

In other words, with a probability $W^i_{[t_0,t_1]}$ we keep the path $X^i_{[t_0,t_1]}$; otherwise
we replace it by a new one randomly chosen in the current
configuration with a probability proportional to its likelihood. After
this updating stage, we again sample $N$ independent copies
$X^i_{[t_1,t_2]} = (X^i_{t_1}, \ldots, X^i_{t_2})$ of the target from $t_1$ up to another fixed time, say $t_2$, and
starting at $X^i_{t_1} = \widehat{X}^i_{t_1}$. We update this new configuration with respect to
the next sequence of observations from $t_1$ to $t_2$ as above by replacing $(t_0, t_1)$
by $(t_1, t_2)$, and so on.

FIGURE 1.2. Particle radar processing

In Figure 1.2, we have represented the genetic type evolution of $N = 5$
particles. The radar measurements give some information to each exploring 
particle on the evolution parameter of the target. The jumps correspond 
to the selection transition where a particle with poor likelihood prefers to 
select a new site. 
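
The selection and mutation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Singer model itself: the signal is taken to be a hypothetical one-dimensional Gaussian random walk, the sensor is assumed to observe the position plus unit-variance Gaussian noise, and the names `particle_step`, `run_filter`, and `sigma_w` are ours. The selection rule is the one given in the text: each path is kept with a probability equal to its $[0,1]$-valued likelihood and is otherwise replaced by a path drawn with probability proportional to the likelihoods.

```python
import numpy as np


def particle_step(paths, observations, rng, sigma_w=0.5):
    """One mutation + selection step over the time block covered by `observations`.

    `paths` has shape (N, T): row i is the whole path of particle i so far.
    """
    observations = np.asarray(observations, dtype=float)
    n_particles = paths.shape[0]

    # Mutation: extend every path with random-walk increments (assumed signal model).
    steps = rng.normal(0.0, sigma_w, size=(n_particles, observations.size))
    new_block = paths[:, -1:] + np.cumsum(steps, axis=1)
    paths = np.hstack([paths, new_block])

    # Likelihood in [0, 1] of each new path segment (assumed unit-variance
    # Gaussian observation noise on the position).
    weights = np.exp(-0.5 * np.sum((observations - new_block) ** 2, axis=1))
    total = weights.sum()
    probs = weights / total if total > 0 else np.full(n_particles, 1.0 / n_particles)

    # Selection: keep path i with probability weights[i]; otherwise replace it
    # by a path chosen with probability proportional to the likelihoods.
    keep = rng.random(n_particles) < weights
    donors = rng.choice(n_particles, size=n_particles, p=probs)
    index = np.where(keep, np.arange(n_particles), donors)
    return paths[index]


def run_filter(observations, n_particles=200, block=1, seed=0):
    """Run the genetic-type scheme on successive blocks of observations."""
    observations = np.asarray(observations, dtype=float)
    rng = np.random.default_rng(seed)
    paths = rng.normal(0.0, 1.0, size=(n_particles, 1))  # N initial locations
    for start in range(0, observations.size, block):
        paths = particle_step(paths, observations[start:start + block], rng)
    return paths
```

Because whole paths are copied during selection, each row of the returned array is the ancestral line of a current particle, which is the genealogical tree structure discussed in the sequel.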

The rationale behind this heuristic-like algorithm is to construct a stochas- 
tic and adaptive grid with a refined degree of precision on the regions with 
high conditional probability mass. This simple example discussed above 
leads inevitably to the following questions: Is this evolutionary stochastic 
grid model well-founded? Is it possible to calibrate the “speed” of conver- 
gence when the precision parameter N tends to infinity? If so, then what 
can we say about the long time behavior of these algorithms? Can we ex- 
tend these ideas to more general optimization and simulation problems? If 
we interpret this genetic type particle scheme as a birth and death model, 
what can we say about the corresponding genealogical trees? The main
difficulty in the asymptotic analysis of these algorithms comes from the
interacting selection mechanism. As we mentioned earlier, the first
well-founded proof of these particle algorithms can be found in [75]. The central
idea was to connect the desired conditional distributions with a new class of
discrete generation and Feynman-Kac particle approximation models.
Surprisingly enough, we shall see in the further development of this book that
the occupation measures of the genealogical tree model associated with the
genetic type particle algorithm above converge to the desired conditional 
distribution on path space (1.2). We shall also prove that the ancestral lines 
of each current individual can be regarded with some respect as a collection 
of approximating independent samples of the complete path of the target 
given the observations. 

The idea of duplicating in a dynamical way better-fitted individuals and 
moving them one step forward to explore state-space regions is the basis 
of various stochastic search algorithms. In Section 1.7, we shall provide a 
rather detailed catalog of models arising in engineering sciences, physics, 
and biology built on this natural exploration strategy. In this connection, 
we mention that these heuristic ideas seem to have emerged in biology in 
the beginning of the 1950s with the article of M.N. Rosenbluth and A.W.
Rosenbluth on macromolecular simulations [280] as well as in physics with 
the article of Kahn and Harris [177]. 

A more systematic and recent study on particle methods and abstract 
Feynman-Kac models in general metric spaces was initiated in the chain 
of articles [91, 93, 95, 97, 98, 99]. The range of applications of these par- 
ticle techniques is attested by the number of articles in engineering and 
applied statistics and particularly in Bayesian literature. For instance, the
book [125] provides a detailed panorama of recent Bayesian applications
in seemingly disconnected areas such as target tracking, computer vision,
and financial mathematics, as well as in biology and in directed polymer
simulations. Unfortunately, these lines of research seem to be developing in 
a blind way with at least no visible connections with the physical and the 
mathematical sides of this field. 

All these developments also revealed and tied up strong and fruitful con- 
nections with classical genetic algorithms. These models were introduced by 
J.H. Holland in [183] in 1975. During the last thirty years, these powerful 
stochastic search algorithms have been used with success in the numeri- 
cal solution of a wide range of global optimization problems. We refer the 
reader to the chain of articles [5, 191, 192, 258, 285, 299, 306, 309] and ref- 
erences therein. The first well-founded proof of the convergence of genetic 
algorithms towards a set of global minima of a potential function on a finite 
state space was due to R. Cerf in 1994 in a chain of articles [45, 46, 47, 48]. 
This line of research was extended and simplified in [94]. These last refer- 
enced articles provide respectively a large-deviation analysis and a semi- 
groups approach combined with log-Sobolev inequalities to study the con- 
centration properties of genetic algorithms with fixed population size as 
the time tends to infinity. 

One of the main objectives of this book is to provide a unifying treatment 
on Feynman-Kac and particle methods. Most of the book is concerned with 
abstract mathematical models in general measurable state spaces. Precise 
applications will be discussed in full detail in a separate chapter. In each 
application model area we consider, we will provide a specific interpretation
of these abstract Feynman-Kac and particle models with a detailed list of
contributions and references. An important part of the book is concerned
with the asymptotic behavior of particle models as the size of the systems
tends to infinity. Special attention is paid to the delicate and probably the 
most important problems of the long time behavior of particle algorithms. 

In the further development of this preliminary chapter, we provide a de- 
tailed introduction on discrete and continuous time Feynman-Kac models 
and genealogical and interacting particle methods. We underline the
fundamental concepts and the general mathematical structure of these models
leaving aside precise constructions with “unnecessary” technical assump- 
tions. We motivate the forthcoming development of the book with illus- 
trations of these abstract mathematical models on several concrete exam- 
ples from advanced signal processing, microstatistical mechanics, polymer 
chemistry, and applied statistics. We also provide several particle interpre- 
tations in connection with the application model areas we consider. We 
finally discuss the connections between these particle models and a class 
of existing Monte Carlo algorithms currently used in Bayesian statistics,
quantum physics, and operations research. We end the chapter with a
detailed guide to the contents of this book.



1.2 Notation and Conventions 

In this preliminary section, we have collected some basic notation and con- 
ventions that we have tried to keep consistently throughout the book. 

We denote respectively by $\mathbb{N}$, $\mathbb{Z}$, and $\mathbb{R}$ the set of all positive integers,
the set of all integers, and the field of all real numbers.

We denote by $\mathcal{M}(E)$ the set of bounded and signed measures on a given
measurable space $(E, \mathcal{E})$. By $\mathcal{M}_0(E)$ and $\mathcal{M}_+(E) \subset \mathcal{M}(E)$, we denote
respectively the subset of measures with null total mass and the subset of
positive measures. Finally, $\mathcal{P}(E)$ and $\mathcal{B}_b(E)$ denote respectively the set of
probability measures and bounded measurable functions on a given
measurable space $(E, \mathcal{E})$. As usual $\mathcal{B}_b(E)$ is regarded as a Banach space with
the supremum norm

$$\|f\| = \sup_{x \in E} |f(x)|$$

We shall slightly abuse the notation and denote by 0 and 1 the zero and
the unit elements in the semirings $(\mathbb{R}, +, \times)$ and $(\mathcal{B}_b(E), +, \times)$. We always
assume implicitly that $\{x\} \in \mathcal{E}$ for any $x \in E$ and we write $\delta_x$, the Dirac
measure at $x$. Unless otherwise stated, the set $\mathcal{M}(E)$ is endowed with the
$\sigma$-algebra generated by $\mathcal{B}_b(E)$; that is, the coarsest $\sigma$-algebra on $\mathcal{M}(E)$
such that the linear functionals

$$\mu \in \mathcal{M}(E) \mapsto \mu(f) = \int f\, d\mu \in \mathbb{R}, \qquad f \in \mathcal{B}_b(E)$$

are measurable. We also denote $\mathrm{Osc}_1(E)$, the convex set of $\mathcal{E}$-measurable
functions $f$ with oscillations less than one; that is,

$$\mathrm{osc}(f) = \sup\{|f(x) - f(y)| \,;\, x, y \in E\} \leq 1$$

We also use the notation, for any / € Bb{E), 

ll/IU = ll/ll +osc(/) 

$(\cdot)^+$, $(\cdot)^-$, and $\lfloor \cdot \rfloor$ denote respectively the positive, negative, and integer
part functions. The maximum and minimum operations are denoted
respectively by $\vee$ and $\wedge$:

$$a \vee b = \max(a, b), \qquad a^+ = a \vee 0$$

$$a \wedge b = \min(a, b), \qquad -a^- = a \wedge 0$$

We extend the preceding operations on the set of functions $\mathcal{B}_b(E)$. For
instance, for a given $f \in \mathcal{B}_b(E)$, we denote by $f^+$ and $f^-$ its positive and
negative parts

$$f^+(x) = f(x) \vee 0 \quad \text{and} \quad -f^-(x) = f(x) \wedge 0$$

For any pair of signed measures $\mu, \eta \in \mathcal{M}(E)$, we say that $\mu$ is absolutely
continuous with respect to $\eta$ and we write $\mu \ll \eta$ if $\mu(A) = 0$ whenever
$\eta(A) = 0$, $A \in \mathcal{E}$. When $\mu \ll \eta$, sometimes we say that $\eta$ is the dominating
measure of $\mu$. We recall that the Radon-Nikodym derivative of $\mu$ with
respect to a dominating measure $\eta$ is the unique function $x \in E \mapsto \frac{d\mu}{d\eta}(x)$
(up to sets of $\eta$-measure zero) such that for any $A \in \mathcal{E}$

$$\mu(A) = \int_A \frac{d\mu}{d\eta}(x)\, \eta(dx)$$



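The relative entropy $\mathrm{Ent}(\mu_1 | \mu_2)$ and the total variation distance $\|\mu_1 - \mu_2\|_{tv}$
between probability measures $\mu_1, \mu_2 \in \mathcal{P}(E)$ are defined by

$$\mathrm{Ent}(\mu_1 | \mu_2) = \int \log \frac{d\mu_1}{d\mu_2}\, d\mu_1$$

if $\mu_1 \ll \mu_2$ and $\infty$ otherwise, and

$$\|\mu_1 - \mu_2\|_{tv} = \sup_{A \in \mathcal{E}} |\mu_1(A) - \mu_2(A)| = \frac{1}{2} \sup\{|\mu_1(f) - \mu_2(f)| \,;\, f \in \mathcal{B}_b(E),\ \|f\| \leq 1\}$$

For a distribution $\mu \in \mathcal{P}(E)$ and $p \geq 1$, we also write $\|\cdot\|_{p,\mu}$, the $L_p(\mu)$-norm

$$\|f\|_{p,\mu} = \Bigl( \int |f|^p\, d\mu \Bigr)^{1/p}$$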
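
On a finite state space these definitions reduce to elementary sums, which makes them easy to check numerically. The following sketch is an illustration only: the finite set $E$, the function names, and the test vectors are our own assumptions. It computes the total variation distance both as half the $\ell_1$ distance (the supremum over $\|f\| \leq 1$ is attained at $f = \mathrm{sign}(\mu_1 - \mu_2)$) and by brute force over all subsets $A$, together with the relative entropy.

```python
import itertools
import numpy as np


def total_variation(mu1, mu2):
    """Half of the l1 distance, i.e. (1/2) sup over ||f|| <= 1 of |mu1(f) - mu2(f)|."""
    return 0.5 * float(np.abs(np.asarray(mu1) - np.asarray(mu2)).sum())


def total_variation_over_sets(mu1, mu2):
    """sup over all subsets A of |mu1(A) - mu2(A)|, by exhaustive enumeration."""
    points = range(len(mu1))
    best = 0.0
    for size in range(len(mu1) + 1):
        for subset in itertools.combinations(points, size):
            best = max(best, abs(sum(mu1[i] - mu2[i] for i in subset)))
    return best


def relative_entropy(mu1, mu2):
    """Ent(mu1 | mu2); infinite when mu1 is not absolutely continuous w.r.t. mu2."""
    mu1 = np.asarray(mu1, dtype=float)
    mu2 = np.asarray(mu2, dtype=float)
    if np.any((mu2 == 0) & (mu1 > 0)):
        return np.inf
    support = mu1 > 0
    return float(np.sum(mu1[support] * np.log(mu1[support] / mu2[support])))
```

The exhaustive enumeration is exponential in the size of $E$ and is only meant as a check of the identity between the two expressions of the distance.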







A (bounded) integral operator from a measurable space $(E_0, \mathcal{E}_0)$ into
another measurable space $(E_1, \mathcal{E}_1)$ is an integral kernel $M(x_0, dx_1)$ such that
for any $(x_0, A_1) \in (E_0 \times \mathcal{E}_1)$ we have

• $M(x_0, \cdot) \in \mathcal{M}(E_1)$ and $\sup_{y_0 \in E_0} |M(y_0, E_1)| < \infty$.

• The mapping $y_0 \in E_0 \mapsto M(y_0, A_1)$ is an $\mathcal{E}_0$-measurable function.

We say that an integral operator $M$ is a Markov kernel from $E_0$ into $E_1$
when we have $M(x_0, \cdot) \in \mathcal{P}(E_1)$ for any $x_0 \in E_0$.

We also recall that any integral transition $M_1(x_0, dx_1)$ from a measurable
space $(E_0, \mathcal{E}_0)$ into another measurable space $(E_1, \mathcal{E}_1)$ generates two
operators, one acting on bounded $\mathcal{E}_1$-measurable functions $f_1 \in \mathcal{B}_b(E_1)$ and
taking values in $\mathcal{B}_b(E_0)$

$$\forall (x_0, f_1) \in (E_0 \times \mathcal{B}_b(E_1)), \qquad (M_1 f_1)(x_0) = \int_{E_1} M_1(x_0, dx_1)\, f_1(x_1)$$

and the other one acting on measures $\mu_0 \in \mathcal{M}(E_0)$ and taking values in
$\mathcal{M}(E_1)$

$$\forall (\mu_0, A_1) \in (\mathcal{M}(E_0) \times \mathcal{E}_1), \qquad (\mu_0 M_1)(A_1) = \int_{E_0} \mu_0(dx_0)\, M_1(x_0, A_1)$$

Finally, if $M_2(x_1, dx_2)$ is a Markov transition from $(E_1, \mathcal{E}_1)$ into another
measurable space $(E_2, \mathcal{E}_2)$, then we denote by $M_1 M_2$ the composite operator

$$(M_1 M_2)(x_0, dx_2) = \int_{E_1} M_1(x_0, dx_1)\, M_2(x_1, dx_2)$$
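
When the state spaces are finite, these three operations are plain matrix products, which gives a convenient mental model for the abstract notation. The sketch below is an illustration under that assumption; the particular matrices and vectors are arbitrary choices of ours. A kernel from $E_0$ into $E_1$ is a matrix with one row per point of $E_0$, a function is a column vector, and a measure is a row vector.

```python
import numpy as np

# Markov kernel M1 from E0 (2 points) into E1 (3 points): each row sums to one.
M1 = np.array([[0.9, 0.1, 0.0],
               [0.2, 0.5, 0.3]])
# Markov kernel M2 from E1 (3 points) into E2 (2 points).
M2 = np.array([[1.0, 0.0],
               [0.5, 0.5],
               [0.0, 1.0]])

f1 = np.array([1.0, -2.0, 4.0])   # bounded function on E1
mu0 = np.array([0.25, 0.75])      # probability measure on E0

M1_f1 = M1 @ f1      # the function M1(f1) on E0
mu0_M1 = mu0 @ M1    # the measure mu0 M1 on E1
M1M2 = M1 @ M2       # the composite kernel M1 M2 from E0 into E2

# Duality between the two actions: mu0(M1 f1) equals (mu0 M1)(f1).
left = mu0 @ M1_f1
right = mu0_M1 @ f1
```

In this finite setting the composite kernel of two Markov kernels is again a stochastic matrix, and the duality between the action on functions and the action on measures is just associativity of the matrix product.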

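For any $\mathbb{R}^d$-valued function $f = (f^i)_{1 \leq i \leq d} \in \mathcal{B}_b(E_1)^d$, any integral operator
$M$ from $E_0$ into $E_1$, and any $\mu \in \mathcal{M}(E_1)$ and $(x^1, \ldots, x^d) \in E_0^d$, we will
slightly abuse the notation, and we write $M(f)$ and $\mu(f)$ the $\mathbb{R}^d$-valued
function and the point in $\mathbb{R}^d$ given by

$$M(f) = (M(f^1), \ldots, M(f^d)) \quad \text{and} \quad \mu(f) = (\mu(f^1), \ldots, \mu(f^d))$$

For any $x \in E_0$ and $\varphi^1, \varphi^2 \in \mathcal{B}_b(E_1)$, we also simplify the notation and we
write

$$M[(\varphi^1 - M\varphi^1)(\varphi^2 - M\varphi^2)](x)$$

instead of

$$M[(\varphi^1 - M(\varphi^1)(x))\,(\varphi^2 - M(\varphi^2)(x))](x) = M(\varphi^1 \varphi^2)(x) - M(\varphi^1)(x)\, M(\varphi^2)(x)$$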

Throughout this book, we will consider various collections of Markov
kernels $K_{\eta_0}(x_0, dx_1)$ from a measurable space $(E_0, \mathcal{E}_0)$ into another
measurable space $(E_1, \mathcal{E}_1)$ and indexed by the set of all probability measures
$\eta_0 \in \mathcal{P}(E_0)$. To avoid repetition, we will always implicitly suppose that for
any $A_1 \in \mathcal{E}_1$ the mappings

$$(x_0, \eta_0) \in (E_0 \times \mathcal{P}(E_0)) \mapsto K_{\eta_0}(x_0, A_1) \in [0, 1]$$

are measurable. In other words, $K_{\eta_0}(x_0, dx_1)$ can be regarded as a single
Markov kernel from $(E_0 \times \mathcal{P}(E_0))$ into $E_1$.

Let $(E_n, \mathcal{E}_n)$, $n \in \mathbb{N}$, be a collection of measurable spaces, and let $M_n$ be
a sequence of (bounded) integral operators from $E_{n-1}$ into $E_n$. We use the
notation $M_{p,n}$ to denote the integral operator from $E_p$ into $E_n$ defined by

$$M_{p,n} = M_{p+1} M_{p+2} \cdots M_n \quad \text{with} \quad M_{n,n} = Id$$

For time-homogeneous models, we simplify the notation and we denote by
$M^n$ the $n$ iterates of the operator $M$

$$M^n = M^{n-1} M = M M^{n-1} \quad \text{with} \quad M^0 = Id$$



We examine in this book discrete and continuous time mathematical
models. To distinguish the discrete and continuous time indices, we use, as
is traditional, the letters $n, p, q \in \mathbb{N}$ for discrete time parameters and the
letters $r, s, t \in \mathbb{R}_+ = [0, \infty)$ for continuous time parameters.

Given a sequence of measurable spaces $(E_n, \mathcal{E}_n)$ we often use the letters
$x_n$, $y_n$ to denote the points in $E_n$. By $f_n$ and $\mu_n$ we denote respectively
a given bounded measurable function and a probability measure on $E_n$. To
simplify the presentation sometimes we also use the notation

$$\forall\, 0 \leq p \leq n, \qquad E_{[p,n]} = \prod_{q=p}^{n} E_q = (E_p \times \cdots \times E_n) \quad \text{and} \quad E_{(p,n]} = E_{[p+1,n]}$$



We shall often use the letter c to denote any nonnegative universal 
constant whose values may vary from line to line but do not depend on 
the time parameter nor on the Feynman-Kac model. We shall denote by 
$a = (a(p))_{p \geq 0}$ or $b = (b(p))_{p \geq 0}$ a sequence of nonnegative universal
constants whose values may also vary from line to line. We shall use the letter
a when the sequence does not depend on the Feynman-Kac model and the 
letter b in the opposite situation. 

We also use the conventions 

$$\sum_{\emptyset} = 0, \qquad \prod_{\emptyset} = 1, \qquad \sup_{\emptyset} = -\infty, \qquad \inf_{\emptyset} = +\infty$$







1.3 Feynman-Kac Path Models 

The main object of this section is to present the Feynman-Kac path distri- 
bution models discussed in this book. For the time being, these models are 
described in a somewhat heuristic way. The full mathematical description 
will be given in the forthcoming development of Chapter 2. 



1.3.1 Path-Space and Marginal Models 

In the discrete time situation, Feynman-Kac path measures are traditionally
defined by the following formulae:

$$\mathbb{Q}_n(d(x_0, \ldots, x_n)) = \frac{1}{Z_n} \Bigl\{ \prod_{p=0}^{n-1} G_p(x_p) \Bigr\}\, \mathbb{P}_n(d(x_0, \ldots, x_n)) \qquad (1.3)$$

The measure $\mathbb{P}_n$ represents the probability distribution of the path
sequence $(X_0, \ldots, X_n)$ of a Markov chain $X_n$ taking values in some measurable
space $(E, \mathcal{E})$, and the potential functions $G_n$ are $\mathcal{E}$-measurable nonnegative
functions such that the normalizing constants are well-defined; that is, for
any $n \in \mathbb{N}$,

$$Z_n = \int \Bigl\{ \prod_{p=0}^{n-1} G_p(x_p) \Bigr\}\, \mathbb{P}_n(d(x_0, \ldots, x_n)) \in (0, \infty)$$

When the potential functions $G_n = \exp V_n$ are related to some energy
function $V_n$, the measures $\mathbb{Q}_n$ can be written in the more familiar form

$$\mathbb{Q}_n(d(x_0, \ldots, x_n)) = \frac{1}{Z_n} \exp\Bigl\{ \sum_{p=0}^{n-1} V_p(x_p) \Bigr\}\, \mathbb{P}_n(d(x_0, \ldots, x_n))$$



The continuous time models are defined similarly by the Feynman-Kac
path measures

$$d\mathbb{Q}_t = \frac{1}{Z_t} \exp\Bigl\{ \int_0^t V_s(X_s)\, ds \Bigr\}\, d\mathbb{P} \qquad (1.4)$$

where $\mathbb{P}$ is the distribution of a canonical Markov process $X_t$ taking values
in some measurable space $(E, \mathcal{E})$ and $V_t$, $t \in \mathbb{R}_+$, is a collection of
measurable functions such that the normalizing constants are well-defined in the
sense that

$$Z_t = \mathbb{E}\Bigl( \exp\Bigl\{ \int_0^t V_s(X_s)\, ds \Bigr\} \Bigr) \in (0, \infty)$$

Even if they look innocent, these Feynman-Kac path measures are very 
complex mathematical objects. To get some feeling of their complexity, we
note that the continuous normalizing constants $Z_t$ are expressed in terms
of functional integrals on path spaces. In the same way, evaluating the discrete
time constants $Z_n$ requires the computation of $n$ integrations over $(E, \mathcal{E})$.

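To get one step further in our discussion, it is convenient to introduce
the terminal time marginals of these path distributions. In the discrete
time case, the corresponding distribution flow $\eta_n$, $n \in \mathbb{N}$, is defined for any
bounded $\mathcal{E}$-measurable function $f$ by the Feynman-Kac formulae

$$\eta_n(f) = \gamma_n(f)/\gamma_n(1) \quad \text{with} \quad \gamma_n(f) = \mathbb{E}\Bigl( f(X_n) \prod_{p=0}^{n-1} G_p(X_p) \Bigr) \qquad (1.5)$$

The continuous time marginal distribution flow $\eta_t$, $t \in \mathbb{R}_+$, is defined in
the same way by the Feynman-Kac formulae

$$\eta_t(f) = \gamma_t(f)/\gamma_t(1) \quad \text{with} \quad \gamma_t(f) = \mathbb{E}\Bigl( f(X_t) \exp\Bigl\{ \int_0^t V_s(X_s)\, ds \Bigr\} \Bigr)$$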

A simple calculation shows that the normalizing constants can be expressed
in terms of the normalized distribution flow with the formulae

$$Z_n = \gamma_n(1) = \prod_{p=0}^{n-1} \eta_p(G_p) \quad \text{and} \quad Z_t = \gamma_t(1) = \exp \int_0^t \eta_s(V_s)\, ds$$

This shows that the unnormalized flow can be computed in terms of the
normalized distributions. More precisely, we easily deduce the following
representations from the display above:

$$\gamma_n(f) = \eta_n(f) \prod_{p=0}^{n-1} \eta_p(G_p) \quad \text{and} \quad \gamma_t(f) = \eta_t(f) \exp \int_0^t \eta_s(V_s)\, ds \qquad (1.6)$$
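
The discrete time product formula follows from the telescoping identity $\gamma_{p+1}(1) = \gamma_p(G_p) = \eta_p(G_p)\,\gamma_p(1)$, and it can be checked numerically on a small example. The sketch below is an illustration only: it assumes a finite state space, a time-homogeneous kernel `M`, and a time-homogeneous potential `G` (all our choices), and evaluates the expectations in (1.5) by summing over every path, which is feasible only for very small models.

```python
import itertools
import numpy as np


def gamma_brute_force(eta0, M, G, f, n):
    """gamma_n(f) = E[f(X_n) prod_{p<n} G(X_p)], summing over all paths of length n + 1."""
    states = range(len(eta0))
    total = 0.0
    for path in itertools.product(states, repeat=n + 1):
        prob = eta0[path[0]]
        for a, b in zip(path, path[1:]):
            prob *= M[a, b]
        weight = np.prod([G[x] for x in path[:-1]])  # empty product is 1 when n = 0
        total += prob * weight * f[path[-1]]
    return total


def eta_brute_force(eta0, M, G, f, n):
    """eta_n(f) = gamma_n(f) / gamma_n(1)."""
    one = np.ones(len(eta0))
    return gamma_brute_force(eta0, M, G, f, n) / gamma_brute_force(eta0, M, G, one, n)


def normalizing_constant_via_flow(eta0, M, G, n):
    """Z_n computed as the product of eta_p(G) for p = 0, ..., n - 1."""
    return float(np.prod([eta_brute_force(eta0, M, G, G, p) for p in range(n)]))
```

The point of the product formula in practice is that each factor $\eta_p(G_p)$ only involves a normalized distribution, which is the quantity that particle methods approximate.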

The second important observation is that the marginal distributions have
the same mathematical structure as the path measures. To make precise
this fundamental stability property, we suppose we are given an auxiliary
Markov chain $X'_n$ taking values in some measurable space $(E', \mathcal{E}')$. We
associate with $X'_n$ the sequence of $n$-stopped processes

$$X_n = (X'_{p \wedge n})_{p \geq 0} \in E = (E')^{\mathbb{N}}$$

It can be easily verified that $X_n$ is again a Markov chain taking values
in the set of $E'$-valued countable sequences. We further suppose that the
potential functions $G_n$ only depend on the $n$th time value of the stopped
chain; that is, we have that

$$G_n(x_n) = G_n((x'_{p \wedge n})_{p \geq 0}) = G'_n(x'_n)$$

for some $\mathcal{E}'$-measurable function $G'_n$. In this situation, $\eta_n$ are distributions
on $E$ and their marginals $\mathbb{Q}'_n \in \mathcal{P}((E')^{n+1})$ on the first $(n+1)$ coordinates
coincide with the Feynman-Kac path measure defined as in (1.3) by
replacing the pair $(X_n, G_n)$ by $(X'_n, G'_n)$. It also follows that the marginal
distributions $\eta'_n \in \mathcal{P}(E')$ of $\mathbb{Q}'_n$ with respect to the terminal time coincide
with the Feynman-Kac measure defined as in (1.5) by replacing the pair
$(X_n, G_n)$ by $(X'_n, G'_n)$. We can summarize the structural property above
with the following synthetic diagram

$$\mathbb{Q}'_n \;\xleftarrow{\ \text{path measure}\ }\; \eta_n \;\xrightarrow{\ n\text{-time marginal}\ }\; \eta'_n$$

Similar arguments apply to continuous time models, but the construction of 
the stopped path process is technically more involved (see for instance [72] 
and references therein). 

These two apparently innocent observations will in fact be essential in the 
development of this book. The first pair of formulae (1.6) lead to a natural 
way to define an unbiased particle estimate of unnormalized Feynman-Kac
flows. It is also the basis for a semigroup and martingale methodology for
studying the asymptotic behavior of particle models. The second observation
on the path Markov chain is essential in the construction of genealogical
tree-based models. It also allows direct transfer of several mathematical
results on the time marginals to path-space models. 

1.3.2 Nonlinear Equations 

In some research areas, such as nonlinear filtering and nonlinear differential 
equations literature, the Feynman-Kac distribution flow models are alter- 
natively defined as a solution of a nonlinear and measure-valued equation. 

To better connect our abstract models with these subjects, we give next a 
brief description of these alternative representations. One drawback of this 
modeling technique is that it often requires more regularity conditions on 
the test functions as well as on the underlying Markov process and potential
functions. On the other hand, this measure-valued process approach gives a
strong basis for constructing particle approximation models. For all of these
reasons, we have chosen to devote a brief introduction on the dynamical
structure of Feynman-Kac distributions. 

In the discrete time situation, we denote by $M_n$ the Markov kernel of
the chain $X_n$ and denote by $\Psi_n$ the Boltzmann-Gibbs transformation from
the set of probability measures $\eta$ on $E$ into itself defined by

$$\Psi_n(\eta)(dx) = \frac{1}{\eta(G_n)}\, G_n(x)\, \eta(dx)$$

To simplify the presentation, we assume that the potential functions are
strictly positive so that the mapping $\Psi_n$ is well-defined on the whole set of
distributions. Using the Markov property and the multiplicative nature of
the Feynman-Kac models, we check that the distribution flows $\gamma_n$ and $\eta_n$
satisfy the recursive equations

$$\gamma_{n+1} = \gamma_n Q_{n+1} \quad \text{and} \quad \eta_{n+1} = \Psi_n(\eta_n) M_{n+1}$$

with

$$Q_{n+1}(f)(x) = G_n(x)\, M_{n+1}(f)(x)$$

and

$$M_{n+1}(f)(x) = \int_E M_{n+1}(x, dy)\, f(y)$$
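
On a finite state space the two recursions can be iterated directly, which gives a useful reference solution against which a particle approximation can be compared. The sketch below is a minimal illustration under our own assumptions: a finite state space, a time-homogeneous kernel `M` and potential `G`, and function names chosen for this example. The unnormalized update is linear, while the normalized update goes through the Boltzmann-Gibbs transformation.

```python
import numpy as np


def boltzmann_gibbs(eta, G):
    """Psi(eta)(x) = G(x) eta(x) / eta(G) on a finite state space."""
    return eta * G / np.dot(eta, G)


def feynman_kac_flow(eta0, M, G, n):
    """Iterate gamma_{p+1} = gamma_p Q and eta_{p+1} = Psi(eta_p) M for n steps."""
    gamma = np.asarray(eta0, dtype=float)
    eta = np.asarray(eta0, dtype=float)
    for _ in range(n):
        gamma = (gamma * G) @ M            # Q(x, dy) = G(x) M(x, dy)
        eta = boltzmann_gibbs(eta, G) @ M  # selection by G, then mutation by M
    return gamma, eta
```

The two flows are related by $\eta_n = \gamma_n / \gamma_n(1)$, so normalizing the first output should reproduce the second up to rounding errors.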

As we already mentioned, the continuous time situation is technically
more involved. To keep things as simple as possible, let us assume that
$E$ is a Polish space and $X_t$ is an $E$-valued Markov process with
time-inhomogeneous infinitesimal generator $L_t$. Under appropriate regularity
conditions, the distribution flows $\gamma_t$ and $\eta_t$ satisfy for sufficiently regular
test functions $f$ the nonlinear equations

$$\frac{d}{dt}\gamma_t(f) = \gamma_t(L_t(f)) + \gamma_t(f V_t)$$

and

$$\frac{d}{dt}\eta_t(f) = \eta_t(L_t(f)) + \eta_t(f\,(V_t - \eta_t(V_t))) \qquad (1.8)$$

Even if they look innocent, the equations (1.7) and (1.8) can rarely be
solved analytically, and their solution requires extensive calculations.
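
The equation for the normalized flow can be recovered formally from the equation for the unnormalized one. The following derivation is a sketch only: it assumes enough regularity to differentiate the ratio $\eta_t(f) = \gamma_t(f)/\gamma_t(1)$ and uses $L_t(1) = 0$, which holds for the generator of a Markov process.

```latex
\begin{align*}
\frac{d}{dt}\gamma_t(1) &= \gamma_t(L_t(1)) + \gamma_t(V_t) = \gamma_t(V_t) = \eta_t(V_t)\,\gamma_t(1), \\
\frac{d}{dt}\eta_t(f) &= \frac{1}{\gamma_t(1)}\,\frac{d}{dt}\gamma_t(f) - \frac{\gamma_t(f)}{\gamma_t(1)^2}\,\frac{d}{dt}\gamma_t(1) \\
&= \eta_t(L_t(f)) + \eta_t(f V_t) - \eta_t(f)\,\eta_t(V_t) \\
&= \eta_t(L_t(f)) + \eta_t\bigl(f\,(V_t - \eta_t(V_t))\bigr).
\end{align*}
```

The first line, integrated from $0$ to $t$ with $\gamma_0(1) = 1$, also gives back the exponential formula $\gamma_t(1) = \exp \int_0^t \eta_s(V_s)\, ds$ for the normalizing constants.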



1.4 Motivating Examples 

The abstract Feynman-Kac models presented in Section 1.3 are at the
crossroads of diverse disciplines. We give next a nonexhaustive list of the
applications discussed in this book together with some comments concerning
the Feynman-Kac interpretation of nonlinear estimation problems. In each
of the application model areas, we provide several motivating and illumi- 
nating examples for the forthcoming particle algorithms developed in this 
book. The reader who wishes to know more details about a more specific 
application is recommended to consult Chapters 11 and 12, entirely devoted 
to applications of Feynman-Kac and particle methods.

1.4.1 Engineering Science 

Feynman-Kac path measures are currently used in engineering science and 
particularly in financial mathematics, signal processing, and nonlinear filtering
problems. They often go by various names, such as the Bayesian
posterior, the conditional distributions of signal, or the Boltzmann-Gibbs
measure, depending on the model areas. In rare event analysis, they rep- 
resent the distribution of a Markov process in the rare event regime (see 
Section 12.2.5). They also appear naturally in the mathematical descrip- 
tion of certain statistical methods. For instance, they represent a change of 
reference probability measure in importance sampling techniques (see Sec- 
tion 2.4.2). A full description of all of these models would of course be too 
much digression. Some of them will be discussed in the further development 
of Chapter 12. Because of their importance in practice, we have chosen in 
this introduction to concentrate on application models in advanced signal 
processing, particularly in nonlinear filtering problems. 

Nonlinear Filtering 

We recall that the filtering problem consists in computing the conditional 
distribution of a path signal process given its noisy and partial observations. 
In the discrete time situation, the state signal is a Markov chain taking values in a Euclidean space,
usually defined through a recursion of the form

$$X_n = F_n(X_{n-1}, W_n)$$

where $X_0$ and $W_n$ are independent random variables taking values in Euclidean spaces and
$F_n$ is a collection of Borel functions between the corresponding spaces. The recursive equation
above may represent the random evolution of a target in tracking problems 
(see [253]), the evolution of an aircraft in radar processing (see [103]), or 
inertial navigation errors in GPS signal processing (see [43]). The noise 
sequence Wn has different possible interpretations. First, it represents the 
uncertainties in the choice of the stochastic mathematical model. More 
interestingly, it models some unknown quantities we want to estimate. For 
instance, in tracking problems, Wn corresponds to the unknown control 
laws of a noncooperative target. 

The state of the signal is not directly observed. The observation process is
traditionally defined in terms of a sequence of random variables, taking values in a Euclidean space,
given by

$$Y_n = H_n(X_n, V_n)$$

The perturbation sequence $V_n$ consists of independent random
variables, taking values in a Euclidean space and independent of the signal $X$, and $H_n$ is a measurable function
into the observation space. We further suppose that the distributions of the random
variables $H_n(x, V_n)$ have the form

$$\mathrm{Proba}(H_n(x, V_n) \in dy) = g_n(x, y)\, q_n(dy)$$

where $q_n$ is a given positive measure on the observation space and $g_n(x, \cdot)$ a density function.
The statistical nature of the perturbation sequence Vn depends on the form 
of the sensors. For instance, they may represent thermic noises resulting 
from electronic devices, the uncertainties in the sensor model, or unknown
quantities such as the atmospheric propagation delays or clock bias in global
positioning system (GPS) processing. For more details, we refer the reader 
to the set of referenced articles. 

Under our statistical assumptions, it is convenient to note that the
observation variables $(Y_0, \ldots, Y_n)$ are independent conditionally on the signal
path $(X_0, \ldots, X_n)$. That is, we have in a symbolic form

$$\mathrm{Proba}\bigl((Y_0, \ldots, Y_n) \in d(y_0, \ldots, y_n) \mid (X_0, \ldots, X_n)\bigr) = \prod_{p=0}^{n} g_p(X_p, y_p)\, q_p(dy_p) \qquad (1.9)$$

If we take the nonhomogeneous potential functions

$$G_n(x) = g_n(x, y_n)$$

in the Feynman-Kac path model (1.3), then from Bayes' rule it becomes
intuitively clear that

$$\mathbb{Q}_n = \mathrm{Law}\bigl((X_0, \ldots, X_n) \mid (Y_0, \ldots, Y_{n-1}) = (y_0, \ldots, y_{n-1})\bigr)$$
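
The Bayes' rule argument can be made slightly more explicit. The following lines are a formal sketch only: they treat the conditional laws as densities with respect to the reference measures $q_p$, use the conditional independence of the observations given the signal path, and assume that the normalizing constant is positive.

```latex
\begin{align*}
\mathrm{Proba}\bigl((X_0,\ldots,X_n) \in d(x_0,\ldots,x_n) \mid (Y_0,\ldots,Y_{n-1}) = (y_0,\ldots,y_{n-1})\bigr)
&\propto \Bigl\{ \prod_{p=0}^{n-1} g_p(x_p, y_p) \Bigr\}\, \mathbb{P}_n(d(x_0,\ldots,x_n)) \\
&= \Bigl\{ \prod_{p=0}^{n-1} G_p(x_p) \Bigr\}\, \mathbb{P}_n(d(x_0,\ldots,x_n)).
\end{align*}
```

Dividing by the total mass of the right-hand side, which is the constant $Z_n$, gives exactly the path measure (1.3). Note that the product stops at $n-1$, which is why the conditioning only involves the observations up to time $n-1$.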



Continuous time problems are defined in terms of a pair signal/observation
Markov process $(S_t, Y_t)$ taking values in $\mathbb{R}^{d + d_v}$. It is the solution of a pair of
Itô's stochastic differential equations

$$dS_t = A(t, S_t)\,dt + B(t, S_t)\,dW_t + \int_{\mathbb{R}^m} C(t, S_{t-}, u)\,(\mu(dt, du) - \nu(dt, du))$$

and

$$dY_t = H_t(S_t)\,dt + \sigma\, dV_t$$

$(V, W)$ is a $(d_v + d_w)$-dimensional standard Wiener process, $\sigma$ is a strictly
positive parameter, $\mu$ is a Poisson random measure on $\mathbb{R}_+ \times \mathbb{R}^m$ with intensity
measure $\nu(dt, du) = dt \otimes F(du)$, and $F$ is a positive $\sigma$-finite measure
on $\mathbb{R}^m$. The mappings $A : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^d$, $B : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^d \otimes \mathbb{R}^{d_w}$,
$C : \mathbb{R}_+ \times \mathbb{R}^d \times \mathbb{R}^m \to \mathbb{R}^d$, and $H : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^{d_v}$ are Borel functions, $S_0$
is a random variable independent of $(V, W, \mu)$, and $Y_0 = 0$. Here again the
first equation represents the evolution laws of the physical signal process
at hand. For instance, the Poisson random measure $\mu$ may represent jump
variations of a moving and noncooperative target (see for instance [105]).

The traditional nonlinear filtering problem in continuous time is to
estimate the conditional distribution of the signal $S_t$ given the observations
$Y_s$ from the origin $s = 0$ up to time $t$. The Kallianpur-Striebel formula
(see for instance [195, 262]) states that there exists a reference probability
measure $\mathbb{P}_0$ under which the signal and the observations are independent.
In addition, for any measurable function $f_t$ on the space $D([0,t], \mathbb{R}^d)$ of
$\mathbb{R}^d$-valued càdlàg paths from 0 to $t$, we have that

$$\mathbb{E}(f_t((S_s)_{s \leq t}) \mid \mathcal{Y}_t) = \frac{\mathbb{E}_0(f_t((S_s)_{s \leq t})\, Z_t(S, Y) \mid \mathcal{Y}_t)}{\mathbb{E}_0(Z_t(S, Y) \mid \mathcal{Y}_t)}$$

where $\mathcal{Y}_t = \sigma(Y_s, s \leq t)$ represents the sigma-field generated by the
observation process and

$$\log Z_t(S, Y) = \int_0^t H_s^*(S_s)\, dY_s - \frac{1}{2} \int_0^t H_s^*(S_s) H_s(S_s)\, ds$$
Jo Jo 

This filtering problem is also related to the numerical solution of some
nonlinear stochastic partial differential equations. To be more precise, we
introduce the end time marginals of the preceding conditional distributions
defined for any bounded Borel function $f$ on $\mathbb{R}^d$ by

$$\eta_t(f) = \mathbb{E}(f(S_t) \mid \mathcal{Y}_t)$$

For sufficiently regular test functions, we can prove that the optimal filter
$\eta_t$ satisfies the Kushner-Stratonovitch equation

$$d\eta_t(f) = \eta_t(L_t(f))\,dt + \eta_t((H_t - \eta_t(H_t))^* f)\,(dY_t - \eta_t(H_t)\,dt)$$

where $L_t$ represents the infinitesimal generator of $S_t$. Next we examine
three situations:

• Discrete time formulation:

Let $t_n$, $n \geq 0$, be a given time mesh with $t_0 = 0$ and $t_n < t_{n+1}$. Also
let $X_n$ be the sequence of random variables defined by

$$X_n = S_{[t_n, t_{n+1}]}$$

By construction, $X_n$ is a nonhomogeneous Markov chain taking values
at each time $n$ in the space $E_n = D([t_n, t_{n+1}], \mathbb{R}^d)$. From previous
considerations, the observation process $Y_t$ can be regarded as a random
environment. Given the observation path, we define the “random”
potential functions $G_n$ on $E_n = D([t_n, t_{n+1}], \mathbb{R}^d)$ by setting for
any $x_n = (x_n(s))_{t_n \leq s \leq t_{n+1}} \in E_n$

$$\log G_n(x_n) = \int_{t_n}^{t_{n+1}} H_s^*(x_n(s))\, dY_s - \frac{1}{2} \int_{t_n}^{t_{n+1}} H_s^*(x_n(s)) H_s(x_n(s))\, ds$$

By construction, we can check that the quenched Feynman-Kac path
measures (1.3) associated with the pair $(X_n, G_n)$ coincide with the
Kallianpur-Striebel representation and we have

$$\mathbb{Q}_n = \mathrm{Law}\bigl(S_{[t_0, t_1]}, \ldots, S_{[t_n, t_{n+1}]} \mid \mathcal{Y}_{t_n}\bigr)$$


• Discrete time observations:

Suppose that the observations are only delivered by the sensors at
some fixed times $t_n$, $n \geq 0$, with $t_n < t_{n+1}$. Also suppose that we are
interested in computing the conditional distributions

$$\mathrm{Law}\bigl(S_{[t_0, t_1]}, \ldots, S_{[t_n, t_{n+1}]} \mid Y_{t_1}, \ldots, Y_{t_{n+1}}\bigr)$$

Notice that

$$Y_{t_{n+1}} - Y_{t_n} = \int_{t_n}^{t_{n+1}} H_s(S_s)\, ds + \sigma\,(V_{t_{n+1}} - V_{t_n})$$

and that $\sigma\,(V_{t_{n+1}} - V_{t_n})$ are independent random variables with Gaussian
density $p_n$. Arguing as in the discrete time formulation and using
the same notation as there, we introduce the “random” potential
functions $G_n$ defined for any $x_n = (x_n(s))_{t_n \leq s \leq t_{n+1}} \in E_n$ by

$$G_n(x_n) = p_n\Bigl( Y_{t_{n+1}} - Y_{t_n} - \int_{t_n}^{t_{n+1}} H_s(x_n(s))\, ds \Bigr)$$

By construction, the quenched Feynman-Kac path measures (1.3)
associated with the pair $(X_n, G_n)$ now have the following interpretation

$$\mathbb{Q}_n = \mathrm{Law}\bigl(S_{[t_0, t_1]}, \ldots, S_{[t_n, t_{n+1}]} \mid Y_{t_1}, \ldots, Y_{t_n}\bigr)$$


• Robust equation: 

The precise description of the robust equation is technically more 
involved than in the discrete time interpretation. The idea is to re- 
move the stochastic integrals in the exponential terms to work with 
a robust pathwise version of the conditional distributions. More pre- 
cisely, using the Girsajiov formula, we can construct a novel reference 
probability measure P so that 



mt{{s>)s<t) I yt) = 



E(/t((g.).<t) Ms,Y) I yt) 
E{Zt{S,Y) I yt) 



with ^ 

logZt{S,Y) = H:{St)Yt+ f UsiSMds 

Jo 

where $U_s$ is a given collection of measurable functions of the signal
and observation states. In contrast to the previous change of reference
measure under P, the canonical Markov process $S_t$ now depends on the
observations. This robust description is clearly related to the
continuous time Feynman-Kac path measures (1.4). For instance, the
robust version $\widehat{\eta}_{y,t}$ of the optimal filter is now given for any
continuous observation path $y = (y_t)_{t \ge 0}$ by the formula

$$\widehat{\eta}_{y,t}(f) = \frac{\eta_{y,t}\left(f\, e^{H_t^*(\cdot)\, y_t}\right)}{\eta_{y,t}\left(e^{H_t^*(\cdot)\, y_t}\right)}$$






The distribution flow $\eta_{y,t}$ is defined by the Feynman-Kac measures

$$\eta_{y,t}(f) = \gamma_{y,t}(f)/\gamma_{y,t}(1)$$




and

$$\gamma_{y,t}(f) = \mathbb{E}\left(f(X_t^y) \exp\left\{\int_0^t U_s(X_s^y, y_s)\, ds\right\}\right)$$

where $X_t^y$ is a Markov process with the same law as the signal process
under P. The robustness property ensures that the pathwise version
of the optimal filter is a continuous function with respect to the
observation process. In practice, this robustness property is a
fundamental requirement of any filter, as small variations of the
observation sequence should not drastically influence the solution.

Examples 

Nonlinear filtering problems have become increasingly important in engi- 
neering and operations research literature. In order to provide a concrete 
basis for the further development of this book, we propose hereafter some 
examples of application areas as well as some discrete time filtering prob- 
lems in their simplest form. For a more thorough discussion, we refer the 
reader to Section 12.6. Next, we examine three estimation problems corre- 
sponding respectively to tracking analysis, stochastic volatility estimation, 
and speech recognition. In each situation, we describe the choice of the 
potential function for which the corresponding Feynman-Kac measures are 
versions of the desired conditional distributions. In all situations, the un- 
derlying Markov model coincides with the signal process. 

1. One traditional situation in tracking problems is to estimate the
location of a moving target $X_n$ in the quarter plane $E = \mathbb{R}_+^2$ with
a fixed observer at the origin (0,0) taking angular measurements
$Y_n$. One of the simplest models for the signal is the Markov chain
$X_n = (X_n^1, X_n^2) \in \mathbb{R}_+^2$ defined by the recursive equations

/ 

The velocities $W_n^1, W_n^2$ and the initial conditions $X_0^1, X_0^2$ are
independent and identically distributed random variables. The noisy
angular positions delivered by the sensor are given by the equation

$$Y_n = \arctan(X_n^2/X_n^1) + V_n$$

The perturbations $V_n$ are assumed to be independent, identically
distributed, and centered Gaussian random variables with unit variance.
In this example, we clearly have

$$\mathrm{Proba}\left((\arctan(x^2/x^1) + V_n) \in dy\right) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(y - \arctan(x^2/x^1))^2}\, dy$$

and we can alternatively choose the potential functions

$$G_n((x^1,x^2)) = \exp\left(-(y_n - \arctan(x^2/x^1))^2/2\right)$$

or

$$G_n((x^1,x^2)) = \exp\left(-(\arctan(x^2/x^1))^2/2 + y_n \arctan(x^2/x^1)\right)$$
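
The equivalence of the two potential choices in the tracking example can be checked numerically. The following is only a minimal sketch: the function names and the sample points are hypothetical, and `math.atan2(x2, x1)` is used in place of arctan(x²/x¹), with which it agrees when x¹ > 0. The two potentials differ by the factor exp(-y_n²/2), which does not depend on the state, so they should give the same normalized weights.

```python
import math

def potential_full(x1, x2, y):
    """G((x1, x2)) = exp(-(y - arctan(x2/x1))^2 / 2)."""
    angle = math.atan2(x2, x1)  # equals arctan(x2/x1) for x1 > 0
    return math.exp(-0.5 * (y - angle) ** 2)

def potential_reduced(x1, x2, y):
    """G((x1, x2)) = exp(-arctan(x2/x1)^2 / 2 + y * arctan(x2/x1))."""
    angle = math.atan2(x2, x1)
    return math.exp(-0.5 * angle ** 2 + y * angle)

def normalized_weights(potential, points, y):
    """Weights proportional to the potential, normalized to sum to one."""
    weights = [potential(x1, x2, y) for (x1, x2) in points]
    total = sum(weights)
    return [w / total for w in weights]

if __name__ == "__main__":
    points = [(1.0, 0.5), (2.0, 2.0), (0.3, 1.2)]  # hypothetical target positions
    print(normalized_weights(potential_full, points, 0.7))
    print(normalized_weights(potential_reduced, points, 0.7))
```

Since Feynman-Kac measures are normalized, either potential can be used; the second one avoids evaluating the square of the observation.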

2. In financial engineering and economics, one recent area of research is 
the estimation of the stochastic volatility of the price of a given asset. 
In this context, the signal X„ represents the random evolution of 
the logarithmic volatility and the observation sequence Yn represents 
the change of amplitude of the return series. The simplest filtering 
model associated with this volatility estimation problem is described 
inductively as 

/ = aX„-i + bWr, 

\ Yn = 

where $W_n$, $V_n$, and $X_0$ are independent random sequences, $a, b, c \in \mathbb{R}$,
and the distribution of $V_n$ is a centered Gaussian distribution with unit
variance. In this example, we have

Proba ((x + c + log V^) £ dy) 

= ^ l,Jy) dy 

and we can take 

G„(x) = exp - (logj/^ - (x + c))]^ 

3. In speech separation analysis, we want to recover mutually
independent signals while observing some noisy mixture of them. This
estimation problem is also called the blind source separation problem. It
is traditionally defined by taking a system of d sources transmitting a
d-dimensional signal $X_n = (X_n^i)_{1 \le i \le d}$ satisfying a system of recursive
formulae of the form

/ 

\ • = 1 d 

where $X_0^i$ and $W_n^i$ are independent and real-valued random variables
and the functions driving the recursion are measurable and real-valued.
We observe a mixture of these signals usually defined in matrix form as
follows

$$Y_n = A_n X_n + V_n$$

where $A_n$ are $(d' \times d)$ matrices. The perturbation sequence $V_n$ is a
collection of independent random variables with a density distribution
$g_n(v)$ with respect to the Lebesgue measure on $\mathbb{R}^{d'}$. Arguing as before,
we can take the potential functions

$$G_n(x) = g_n(y_n - A_n x)$$



4. Data assimilation methodologies often refer to high-dimensional
estimation problems such as those arising in forecast prediction
analysis. In this application area, the signal process may represent an
ocean dynamic model such as the Miami Isopycnic Coordinate Ocean
Model [33], an Indian Ocean model [140, 141], the classical Lorenz
attractor [266], or a barotropic ocean model [198]. In this context,
the measurements $Y_n$ represent acoustic tomography data (see [198])
or sea level anomalies and sea surface temperature data (see [140]).
Here again, the problem is to estimate the conditional distributions
of the signal given its noisy and partial observations. In forecast and
oceanographic model literature, these classical filtering models are
also called sequential data assimilation.

1.4.2 Bayesian Methodology

Many estimation problems arising in operations research can be thought 
of as filtering problems. This point of view is at the heart of Bayesian 
methodology. In this branch of statistics, the conditional distributions (1.9) 
are the so-called “Bayesian posterior” and the path distribution of the 
signal is called the “prior”. Although most of the research articles in this
field are written in a somewhat heuristic way, the Bayesian literature
abounds with applications of nonlinear filtering models in many research 
areas, including neural networks, robot localizations, time series estimation 
and target recognition. We refer the interested reader to the set of articles 
in the collective book [125]. To better connect the Feynman-Kac measures 
(1.3) with the “Bayesian language,” we observe that 

$$\mathbb{Q}_n(d(x_0,\ldots,x_n)) = \frac{\mathcal{Z}_{n-1}}{\mathcal{Z}_n}\, \mathbb{Q}_{n-1}(d(x_0,\ldots,x_{n-1}))\, G_{n-1}(x_{n-1})\, M_n(x_{n-1},dx_n)$$

where $M_n$ stands for the Markov transition of the chain $X_n$. To get rid of
the normalizing constants, we use the proportional sign $\propto$ and we rewrite
the expression above as

$$\mathbb{Q}_n(d(x_0,\ldots,x_n)) \propto \mathbb{Q}_{n-1}(d(x_0,\ldots,x_{n-1}))\, Q_n(x_{n-1},dx_n) \qquad (1.10)$$

with the positive integral operator

$$Q_n(x_{n-1},dx_n) = G_{n-1}(x_{n-1})\, M_n(x_{n-1},dx_n)$$

Also notice that any positive integral operator can be written as above.
To prove this elementary observation, we simply take

$$G_{n-1}(x_{n-1}) = Q_n(1)(x_{n-1}) \quad\text{and}\quad M_n(x_{n-1},dx_n) = \frac{Q_n(x_{n-1},dx_n)}{Q_n(1)(x_{n-1})}$$

This alternative representation of the Feynman-Kac path measures (1.3)
is commonly used in Sequential Monte Carlo literature. Since various
authors in this field seem to be reluctant to use measure theory, the formula
displayed above is more often written in terms of distribution density
functions. In the further development of this book, we will see that (1.10) is
in fact equivalent to writing a measure-valued dynamical equation. As a
result, most of the algorithms discussed in Sequential Monte Carlo
literature coincide with the particle approximation model of these nonlinear
equations in distribution space.
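
On a finite state space a positive integral operator is a matrix with nonnegative entries, and splitting it into a potential and a Markov transition amounts to normalizing its rows. The sketch below illustrates this under the assumption that every row has a strictly positive sum; the function name `decompose_kernel` and the sample matrix are hypothetical.

```python
def decompose_kernel(Q):
    """Split a positive kernel Q (list of rows) into G = Q(1) and M = Q / Q(1).

    Assumes every row of Q has a strictly positive sum.
    """
    G = [sum(row) for row in Q]                         # G(x) = Q(1)(x)
    M = [[q / g for q in row] for row, g in zip(Q, G)]  # M(x, y) = Q(x, y) / Q(1)(x)
    return G, M

if __name__ == "__main__":
    Q = [[0.2, 0.6], [1.5, 0.5]]   # hypothetical positive kernel on two states
    G, M = decompose_kernel(Q)
    print(G)   # row masses, playing the role of the potential
    print(M)   # row-normalized kernel, a Markov transition
```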

1.4.3 Particle and Statistical Physics

In physics, Feynman-Kac formulae occur in a variety of topics, such as
trapping problems, Schrödinger equations, quantum physics, and
microstatistical mechanics.

Trapping Analysis

In the discrete time situation, we can formally model a killed particle
motion in an absorbing medium by “adding” in the random evolution of a
Markov chain $X_n$ a trapping mechanism

$$X_n \xrightarrow{\ \text{trapping}\ } \widehat{X}_n \xrightarrow{\ \text{exploration}\ } X_{n+1}$$

The trapping transition consists in killing the particle at site $X_n$ with a
probability $(1 - G_n(X_n))$, where $G_n$ is a $[0,1]$-valued potential function.
Notice that the particle is not trapped when visiting regions where the
potential function $G_n$ is equal to 1. The opposite regions, where $G_n$ is equal
to 0, are called hard obstacles. “Soft obstacles” correspond to regions in the
medium where $G_n \in (0,1)$. During the exploration phase $\widehat{X}_n \to X_{n+1}$,
the particle $\widehat{X}_n$ simply evolves in the medium to a new location $X_{n+1}$
randomly chosen according to the distribution $M_{n+1}(\widehat{X}_n, \cdot)$. We let $T$ denote
the lifetime of the particle. By construction, the normalizing constants $\mathcal{Z}_n$
in (1.3) represent the probability that the particle $X_n$ is still alive at time
n and

$$\mathbb{Q}_n = \mathrm{Law}((X_0,\ldots,X_n) \mid T \ge n)$$

The continuous time Feynman-Kac path formula (1.4) can also be
interpreted as the distribution of a trapped Markov motion. In this context,
the killed Markov process evolves randomly in the medium according to an
L-motion where L is an infinitesimal generator. It is killed at rate $U(X_t)$,
where U is a nonnegative potential function on the medium. By taking
$V_t = -U$ in (1.4) and P as the distribution of the Markov L-motion, we
end up formally with

$$\mathbb{Q}_t = \mathrm{Law}((X_s)_{s \le t} \mid T > t)$$

where T stands for the lifetime of the killed process.

Schrödinger Operators

The Feynman-Kac path distributions (1.4) are also closely related to
Schrödinger equations. To describe these interesting links, we further suppose
that $E = \mathbb{R}^d$ and $X_t$ is a time-homogeneous Markov process with
infinitesimal generator L. We also suppose that the potential function is
time-homogeneous with $V_t = V$ and we denote by $V^+$ and $-V^-$ its positive and
negative parts. In quantum physics and microstatistical mechanics, the
potentials $V^+$ and $V^-$ are respectively called the “creation” and “killing”
potentials (see [254]). When $V^+$ is null, we have $V = -V^-$. In this
situation, the Feynman-Kac positive measures

$$\gamma_t(f) = \mathbb{E}\left(f(X_t) \exp\left\{-\int_0^t V^-(X_s)\, ds\right\}\right)$$

satisfy for sufficiently regular test functions f the linear equation

$$\frac{d}{dt}\gamma_t(f) = \gamma_t(L^V(f))$$

with the Schrödinger operator $L^V = L - V^-$. In this case, we can argue as
above and interpret $\gamma_t$ as the distribution of a single Markov particle
evolving in a random medium with killing rate $V^-$. This interpretation does not
hold true when the potential V has a “creation” component. To understand
the role of $V^+$, it is convenient to work with the normalized distribution
flow $\eta_t(f) = \gamma_t(f)/\gamma_t(1)$. We recall that $\eta_t$ satisfies for sufficiently regular
test functions f the nonlinear equation

$$\frac{d}{dt}\eta_t(f) = \eta_t(L(f)) + \eta_t(f(V - \eta_t(V))) \qquad (1.11)$$

As usual, we can interpret $\eta_t$ as the law of a nonhomogeneous
Markov process $X_t$ with a (nonunique) nonhomogeneous infinitesimal
generator $L_{\eta_t}$ satisfying the compatibility condition

$$\eta(L_\eta(f)) = \eta(L(f)) + \eta(f(V - \eta(V)))$$

for any distribution $\eta$ on $\mathbb{R}^d$. We can choose for instance

$$L_\eta = L + \widehat{L}_\eta^- + \widehat{L}_\eta^+$$

with the jump type generators

$$\widehat{L}_\eta^-(f)(x) = V^-(x) \int (f(y) - f(x))\, \eta(dy)$$

$$\widehat{L}_\eta^+(f)(x) = \int (f(y) - f(x))\, V^+(y)\, \eta(dy)$$

The corresponding Markov process $X_t$ is a jump type nonlinear Markov
process. Between the jumps, it evolves randomly according to an L-motion.
At rate $V^-$, it is killed and instantly jumps to a new site randomly chosen
according to the current distribution $\eta_t$. The “creation” term induces an
auxiliary nonhomogeneous killing rate $\eta_t(V^+)$. At this rate, the particle dies
and instantly a particle randomly chosen with a distribution proportional
to $V^+(x)\, \eta_t(dx)$ splits into two offsprings.

The interpretation above is connected to the construction of the
Schrödinger process using renormalizing techniques. We refer the interested
reader to the book [254].

We also mention that for sufficiently regular L-motions the top eigenvalue
$\lambda(V)$ of the Schrödinger operator $L^V = L + V$ can be formulated in terms
of the long time behavior of the nonlinear distribution flow model (see
Section 12.4 and the article [102]). For instance, for sufficiently stable
L-motions, we will see that

$$\lambda(V) = \lim_{t \to \infty} \frac{1}{t} \int_0^t \eta_s(V)\, ds$$

In the reversible situation, we can also prove that $\lambda(V)$ coincides with
the top of the spectrum of the Schrödinger operator $L^V$ and the density of
the stationary distribution of the flow $\eta_t$ (with respect to the reversible
measure) is proportional to the corresponding eigenvector.
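
The link between the top eigenvalue and the long time average of the normalized flow can be illustrated on a two-state space, where everything is explicit. The sketch below is an illustration under stated assumptions rather than a method from the text: the generator L = [[-a, a], [b, -b]], the potential values (v1, v2), the explicit Euler scheme, and the step size are all hypothetical choices.

```python
import math

def top_eigenvalue_2x2(a, b, v1, v2):
    """Largest eigenvalue of L + V for L = [[-a, a], [b, -b]] and V = diag(v1, v2)."""
    m11, m12, m21, m22 = -a + v1, a, b, -b + v2
    trace = m11 + m22
    det = m11 * m22 - m12 * m21
    return 0.5 * (trace + math.sqrt(trace * trace - 4.0 * det))

def time_average_of_flow(a, b, v1, v2, horizon=100.0, dt=1e-3):
    """Euler scheme for d/dt eta_t(f) = eta_t(L f) + eta_t(f (V - eta_t(V))).

    Returns (1 / horizon) times the integral of eta_s(V) over [0, horizon].
    """
    eta = [0.5, 0.5]                      # arbitrary initial distribution
    integral = 0.0
    for _ in range(int(round(horizon / dt))):
        mean_v = eta[0] * v1 + eta[1] * v2
        integral += mean_v * dt
        # (eta L)(y) plus the selection term eta(y) (V(y) - eta(V))
        d0 = -a * eta[0] + b * eta[1] + eta[0] * (v1 - mean_v)
        d1 = a * eta[0] - b * eta[1] + eta[1] * (v2 - mean_v)
        eta = [eta[0] + dt * d0, eta[1] + dt * d1]
    return integral / horizon

if __name__ == "__main__":
    print(top_eigenvalue_2x2(1.0, 2.0, 0.5, -1.0))
    print(time_average_of_flow(1.0, 2.0, 0.5, -1.0))
```

The two printed values should be close; the gap shrinks as the time horizon grows, since the transient part of the flow is averaged out.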

Interacting Jump and Boltzmann Type Models

Nonlinear equations of type (1.11) can be interpreted as evolutionary type
interacting jump models. The quadratic structure of the evolution can be
extended to the situation where the potential function U acts on the
transition space $E \times E$. The corresponding equation is now given by

$$\frac{d}{dt}\eta_t(f) = \eta_t(L_t(f)) + \int_{E \times E} (f(x) - f(y))\, U(x,y)\, \eta_t(dx)\, \eta_t(dy)$$

Here again, we have a Feynman-Kac interpretation. Simply note that the
“implicit Feynman-Kac” flow defined by

$$\eta_t(f) = \mathbb{E}\left(f(X_t) \exp\left\{\int_0^t \int [U(X_s,x) - U(x,X_s)]\, \eta_s(dx)\, ds\right\}\right)$$

satisfies the desired equation. In addition, using the same arguments as
in [98], one can check that the flow is the unique solution of the integral
equation

$$\eta_t(f) = \eta_0(P_{0,t}(f)) + \int_0^t \int_{E \times E} [P_{s,t}(f)(x) - P_{s,t}(f)(y)]\, U(x,y)\, \eta_s(dx)\, \eta_s(dy)\, ds$$

where $P_{s,t}$ stands for the semigroup of X.

These continuous time Feynman-Kac models can also be interpreted as
particular examples of generalized and spatially homogeneous Boltzmann
models introduced by S. Méléard in [245] and further developed in a chain
of articles [165, 166, 246, 247, 248]. Whenever it exists, the Feynman-Kac
representation often provides natural semigroup and martingale techniques
to analyze the long time behavior of the flow and/or the asymptotic
behavior of the interacting jump approximation models. We refer the interested
reader to the articles [95, 97, 98, 102].



1.4.4 Biology

In biology, Feynman-Kac formulae also occur in a variety of topics, such as
chemical polymerizations, and genetic and genealogical population models.

Directed Polymers

In this application model area, the path distributions (1.3) represent the
Boltzmann-Gibbs measures associated with a random and directed polymer
chain. In this context, the underlying Markov chain represents random
chemical polymerizations in a given solvent E. To simplify the presentation,
we further suppose that $X_n$ is a nonhomogeneous Markov chain

$$X_n = X'_{[0,n]} =_{\mathrm{def.}} (X'_0, \ldots, X'_n) \in E_n = \underbrace{E \times \cdots \times E}_{(n+1)\ \text{times}}$$

The elementary random variables $X'_p$, with $p \le n$, represent the monomers
in a directed chain $X_n$ with a polymerization degree n. In this situation
the function $G_n$ on $E_n$ reflects the intermolecular attraction or repulsive
potential interactions between the monomers. The precise structure of these
intermolecular interactions is usually unknown. We often need to represent
these chemical reactions by a simplified model. It is often assumed that the
“free chemical construction” of a directed polymer is Markovian. In other
words, the monomer sequence is an E-valued Markov chain. Under this
Markovian hypothesis, random polymerizations with strong repulsions are
closely related to nonintersecting Markov chains. Indeed, if we choose in




(1.3) the potential functions

$$G_n(x_0,\ldots,x_n) = 1_{\{x_n \notin \{x_0,\ldots,x_{n-1}\}\}}$$

then the corresponding Feynman-Kac path measures represent the
distributions of the path of a nonintersecting Markov chain

$$\mathbb{Q}_n = \mathrm{Law}((X'_0,\ldots,X'_n) \mid X'_p \ne X'_q,\ \forall\, 0 \le p < q < n)$$

Note that the normalizing constants are well-defined as soon as for any
$n \in \mathbb{N}$ we have

$$\mathcal{Z}_n = \mathrm{Proba}(X'_p \ne X'_q\ ;\ \forall\, 0 \le p < q < n) > 0$$

More regular intermolecular interactions are represented by a Hamiltonian
energy function $H_n$ and a cooling temperature parameter $T_n$. These models
are again described by Boltzmann-Gibbs measures (1.3) with path potential
functions

$$G_n(x_0,\ldots,x_n) = \exp\left(-\frac{1}{T_n} H_n(x_0,\ldots,x_n)\right)$$
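
For a concrete feel of the normalizing constants in the nonintersecting case, one can take the free chain to be the simple random walk on the square lattice (an assumption made here only for illustration) and compute the probability that its first steps do not intersect by exact enumeration. The function names below are hypothetical.

```python
def count_self_avoiding_walks(num_steps):
    """Count nearest-neighbor paths on Z^2 from the origin with no repeated site."""
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    def extend(position, visited, remaining):
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in moves:
            nxt = (position[0] + dx, position[1] + dy)
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)   # backtrack
        return total

    return extend((0, 0), {(0, 0)}, num_steps)

def nonintersection_probability(num_steps):
    """Probability that a simple random walk on Z^2 visits num_steps + 1 distinct sites."""
    return count_self_avoiding_walks(num_steps) / 4 ** num_steps

if __name__ == "__main__":
    for k in range(1, 8):
        print(k, count_self_avoiding_walks(k), nonintersection_probability(k))
```

The probability is strictly positive for every length but decays geometrically, which is why naive rejection sampling of long nonintersecting chains becomes impractical.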

Branching Population Models

Feynman-Kac measures are also related to genetic evolution models. This
connection will become transparent when we describe in Section 1.5 the
particle interpretations of Feynman-Kac distribution flows. In particular,
we will see that the genealogical tree occupation measure associated with
a “simple genetic” model converges as the size of the population tends to
infinity to a Feynman-Kac path measure. Genetic models can be thought
of as the evolution of a population on a given state space whose individuals
reproduce or die subject to a natural evolution interaction mechanism. The
members of the population are often called particles in reference to physics
application model areas.

These genetic processes differ from branching models in which the
particles do not interfere with one another such as the generalized
Galton-Watson models developed in the book by T.E. Harris [176]. For a more
thorough study of branching measure-valued processes from the point of view
of the general theory of Markov processes, we refer the reader to the
book by E.B. Dynkin [128].

Our next objective is to better connect these two classes of branching
models and underline their kinship with the Feynman-Kac models discussed
in this book.

We denote by $S = \cup_{p \ge 0} E^p$ the state space of the branching population
model whose members take values in some measurable space $(E, \mathcal{E})$. The
integer parameter p represents the size of the population, and for p = 0
we use the convention $E^0 = \{c\}$, where c represents a cemetery or coffin
state. We consider a sequence of Markov transitions $M_n(x,dy)$ on E and
a collection of $\mathbb{N}$-valued random variables $(g_n^i(x))_{i \ge 1, x \in E}$ with uniformly
finite first and second moments. We also assume that, for each $x \in E$,
$(g_n^i(x))_{i \ge 1}$ are identically distributed, and we set $\mathbb{E}(g_n^i(x)) = G_n(x)$.

Our branching process $(\mathcal{X}_n)_{n \ge 0}$ consists in a Markov chain taking values
in S. Its initial state $\mathcal{X}_0$ is assumed to be a single E-valued random particle
with a given distribution $\eta_0 \in \mathcal{P}(E)$, and its elementary transitions are
defined in terms of a two-step branching/exploration mechanism:

$$\mathcal{X}_n \in E^{p_n} \xrightarrow{\ \text{branching}\ } \widehat{\mathcal{X}}_n \in E^{\widehat{p}_n} \xrightarrow{\ \text{exploration}\ } \mathcal{X}_{n+1} \in E^{p_{n+1}}$$

During the branching stage, each individual $\mathcal{X}_n^i$ gives birth to a random
number of offsprings $g_n^i(\mathcal{X}_n^i)$, with $1 \le i \le p_n$. At the end of the branching
transition, we have a population $\widehat{\mathcal{X}}_n = (\widehat{\mathcal{X}}_n^1, \ldots, \widehat{\mathcal{X}}_n^{\widehat{p}_n})$ with
$\widehat{p}_n = \sum_{i=1}^{p_n} g_n^i(\mathcal{X}_n^i)$ particles. Whenever the system dies at a given time n,
we have $(\widehat{p}_n, \widehat{\mathcal{X}}_n) = (0,c)$, and we set
$(\widehat{p}_k, \widehat{\mathcal{X}}_k) = (p_{k+1}, \mathcal{X}_{k+1}) = (0,c)$ for any $k \ge n$. Note
that the reproduction ability of each individual is not affected by the rest
of the population. During the exploration transition, each particle evolves
randomly according to the elementary Markov transition $M_{n+1}$.

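By construction, we have $p_{n+1} = \widehat{p}_n$, and $\mathcal{X}_{n+1} = (\mathcal{X}_{n+1}^i)_{1 \le i \le p_{n+1}}$ forms
a sequence of independent random variables with respective distributions

$$(M_{n+1}(\widehat{\mathcal{X}}_n^i, \cdot))_{1 \le i \le p_{n+1}}$$

We introduce the point distributions

$$s(\mathcal{X}_n) = \sum_{i=1}^{p_n} \delta_{\mathcal{X}_n^i} \quad\text{and}\quad s(\widehat{\mathcal{X}}_n) = \sum_{i=1}^{\widehat{p}_n} \delta_{\widehat{\mathcal{X}}_n^i} = \sum_{i=1}^{p_n} g_n^i(\mathcal{X}_n^i)\, \delta_{\mathcal{X}_n^i}$$

with the convention $s(c) = 0$, the null measure on E. For any
$f \in \mathcal{B}_b(E)$, we easily check that

$$\mathbb{E}(s(\mathcal{X}_{n+1})(f) \mid \widehat{\mathcal{X}}_n) = \sum_{i=1}^{p_n} g_n^i(\mathcal{X}_n^i)\, M_{n+1}(f)(\mathcal{X}_n^i)$$

$$\mathbb{E}(s(\mathcal{X}_{n+1})(f) \mid \mathcal{X}_n) = \sum_{i=1}^{p_n} G_n(\mathcal{X}_n^i)\, M_{n+1}(f)(\mathcal{X}_n^i) = s(\mathcal{X}_n)(Q_{n+1}(f))$$

with the bounded positive semigroup $Q_{n+1}$ on $\mathcal{B}_b(E)$ defined by

$$Q_{n+1}(f)(x) = G_n(x)\, M_{n+1}(f)(x)$$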

Using a simple induction, we check that the Feynman-Kac flow associated
with the pair potential/transition $(G_n, M_n)$ corresponds to the first
moments of the branching point distributions. That is, we have that

$$\mathbb{E}(s(\mathcal{X}_n)(f)) = \mathbb{E}\left(f(X_n) \prod_{p=0}^{n-1} G_p(X_p)\right)$$

where $X_n$ represents an E-valued Markov chain with elementary transitions
$M_n$ and initial distribution $\eta_0 = \mathrm{Law}(\mathcal{X}_0)$. This representation of birth and
death models in terms of a point distribution is due to Moyal [251, 252]
and it has been further developed by Harris [175, 176].

From previous considerations, we observe that an elementary branching
approximation algorithm of the unnormalized Feynman-Kac flow could
consist in sampling N independent branching models of the previous form.
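
The branching approximation just described can be sketched on a two-state space and compared with the exact unnormalized flow. Everything specific below is a hypothetical choice made for illustration: the transition matrix `M`, the mean offspring numbers `G`, the initial law `ETA0`, and the offspring law (integer part of G(x) plus a Bernoulli correction, which has mean G(x)).

```python
import random

# Hypothetical two-state model, chosen only for illustration.
M = [[0.7, 0.3], [0.4, 0.6]]   # Markov transition M(x, y) on E = {0, 1}
G = [0.5, 1.5]                 # mean offspring numbers G(x)
ETA0 = [0.5, 0.5]              # law of the initial particle

def exact_gamma(n, f):
    """gamma_n(f) = E(f(X_n) prod_{p<n} G(X_p)), via gamma_{p+1} = gamma_p Q."""
    gamma = ETA0[:]
    for _ in range(n):
        gamma = [sum(gamma[x] * G[x] * M[x][y] for x in range(2)) for y in range(2)]
    return sum(gamma[y] * f[y] for y in range(2))

def offspring(x, rng):
    """Random integer with mean G[x]: floor(G[x]) plus a Bernoulli correction."""
    base = int(G[x])
    return base + (1 if rng.random() < G[x] - base else 0)

def branching_estimate(n, f, num_runs, seed=0):
    """Average of s(X_n)(f) over independent branching populations."""
    rng = random.Random(seed)
    total = 0.0
    for _run in range(num_runs):
        population = [0 if rng.random() < ETA0[0] else 1]
        for _step in range(n):
            children = []
            for x in population:
                for _child in range(offspring(x, rng)):     # branching at site x
                    # exploration: each offspring moves according to M(x, .)
                    children.append(0 if rng.random() < M[x][0] else 1)
            population = children
        total += sum(f[x] for x in population)
    return total / num_runs

if __name__ == "__main__":
    f = [1.0, 1.0]
    print(exact_gamma(5, f), branching_estimate(5, f, 20000))
```

Because the populations do not interact, the estimate is unbiased, but the population size of each run fluctuates freely and can die out, which is the main practical drawback compared with the interacting models of Section 1.5.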

We finally mention that these simplified models are currently used in
statistical physics to study the transport and the multiplication of neutrons
(see for instance [176] and references therein). They also have some
interesting features of more sophisticated population evolution models arising
in molecular biology [203]. For a more recent account on branching
population models, we refer the reader to the books [14, 128, 180] and
references therein. Related continuous time branching interpretation models of
Feynman-Kac formulae can also be found in [96].



1.4.5 Applied Probability and Statistics

There is a remarkable advantage in formulating a nonlinear estimation
problem in terms of a Feynman-Kac path measure. From the theoretical
point of view, Feynman-Kac models represent a change of probability
measures. The change of probability mass expresses in some way the
relative information between a complex distribution and a simpler reference
one. The frustration practitioners may get when developing these abstract
models will be relieved since the various particle interpretations of these
measures will give instantly a collection of powerful simulation tools for the
numerical solution of the nonlinear problem at hand.

These Feynman-Kac modeling techniques offer nice and natural
probabilistic interpretations of Dirichlet problems with boundary conditions (see
Section 12.2.2 and Section 12.2.4) and of fixed points of nonlinear integral
operators (see Chapter 5 and Section 12.4). They also appear in the
sequential analysis of probability test ratio (see Section 12.3.3) and in Monte
Carlo Markov chain problems (see Chapters 5 and 6, and Section 11.2). To
illustrate these comments, we give next one original way to describe the
conditional distribution of a Markov chain restricted to its terminal value
in terms of a Feynman-Kac path measure (see [80, 81] and Section 5.5).

Let $\pi$ and L be respectively a probability measure and a Markov kernel
on a measurable state space $(S, \mathcal{S})$. We associate with the pair $(\pi, L)$ the
S-valued Markov chain $(\Omega, F, Y, \mathbb{P}_\pi)$ with initial distribution $\pi$ and Markov
transitions L. Let K be an auxiliary Markov kernel on S such that the pair
measures on $E = (S \times S)$ defined by

$$(\pi \times K)_1(d(y,y')) = \pi(dy)\, K(y,dy')$$

$$(\pi \times L)_2(d(y,y')) = \pi(dy')\, L(y',dy)$$

are mutually absolutely continuous. We further require that the triplet
$(\pi, K, L)$ be chosen so that the Radon-Nikodym derivative

$$\frac{d(\pi \times L)_2}{d(\pi \times K)_1}$$

is a bounded and strictly positive function on E. Using a time reversal
technique (see Section 5.5.3), we can prove that for any bounded measurable
function $f_n$ on $S^{n+1}$ we have



$$\mathbb{E}_\pi(f_n(Y_n, Y_{n-1}, \ldots, Y_0) \mid Y_{n+1} = y)$$

In the display above, $\mathbb{E}_\pi$ represents the expectations with respect to $\mathbb{P}_\pi$.
Similarly, $\mathbb{E}_y^K$ stands for the expectation with respect to the distribution
$\mathbb{P}_y^K$ of a (canonical) Markov chain with transitions K and starting at y.

This description of a chain restricted to its terminal value can be
expressed as a Feynman-Kac path measure (1.3) through a state-space
enlargement and a clear time reversal.

When the Markov transition L is sufficiently regular, we also notice that 
in some sense 

KiYo € dy'l y„+i = y) = ^(dy')^^^^(y) > ir{dy') 

n — > oo 

From this property, we will prove that the marginal Feynman-Kac
distribution flow converges to $\pi$ as the time parameter tends to infinity. In
Section 5.5, we will also use these observations to construct a genealogical
path particle simulation technique for sampling restricted Markov models.
As the form of the potential function already indicates, we will see that the
corresponding particle model behaves as a sequence of interacting
Metropolis models. We already mention that these Feynman-Kac-Metropolis models
have “better” asymptotic properties than a traditional Metropolis-Hastings
model.



1.5 Interacting Particle Systems 

From the pure mathematical point of view, particle methods can be viewed
as a kind of stochastic linearization technique for solving nonlinear
equations in distribution space. The idea is to associate to a given nonlinear
dynamical structure a sequence of $E^N$-valued Markov processes such that
the N-empirical measures of the configurations converge as $N \to \infty$ to
the desired distribution. The parameter N represents the precision
parameter and the size of the systems. The state components of the $E^N$-valued
Markov process are called particles.

The choice of the particle model is far from being unique. Loosely speak- 
ing, it depends on the “Markov interpretation” of the nonlinear equation 
at hand. The objective of this section is to design a general strategy to 
construct a class of particle approximation models for Feynman-Kac dis- 
tribution flows. Before entering into a precise description of these models, 
we provide hereafter a brief reminder on the dynamical structure of the 
discrete and continuous time Feynman-Kac flows. We first recall that these 
models are respectively defined by 

$$\eta_n(f) = \gamma_n(f)/\gamma_n(1) \quad\text{and}\quad \eta_t(f) = \gamma_t(f)/\gamma_t(1)$$

with

$$\gamma_n(f) = \mathbb{E}\left(f(X_n) \prod_{p=0}^{n-1} G_p(X_p)\right)$$

and

$$\gamma_t(f) = \mathbb{E}\left(f(X_t) \exp\left\{\int_0^t V_s(X_s)\, ds\right\}\right)$$

The discrete and continuous time stochastic processes $X_n$ and $X_t$ are
nonhomogeneous Markov processes taking values in a measurable space $(E, \mathcal{E})$.
$X_n$ is a Markov chain with elementary transitions $M_n$, and $X_t$ is a Markov
process with infinitesimal generators $L_t$. The potential functions $G_n$ and
$V_t$ are bounded $\mathcal{E}$-measurable functions on E such that the normalizing
constants are well-defined. To simplify the presentation, we further assume
that $G_n$ is strictly positive. In Section 1.3, we have seen that these
distributions satisfy respectively the nonlinear equations

$$\eta_{n+1} = \Psi_n(\eta_n) M_{n+1} \quad\text{and}\quad \frac{d}{dt}\eta_t(f) = \eta_t(L_t(f)) + \eta_t(f(V_t - \eta_t(V_t)))$$

with the Boltzmann-Gibbs transformation $\Psi_n$ from the set of probability
measures $\eta$ on E into itself defined by

$$\Psi_n(\eta)(dx) = \frac{1}{\eta(G_n)}\, G_n(x)\, \eta(dx)$$
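
On a finite state space the discrete time recursion reduces to a reweighting followed by a matrix product. The sketch below spells out one step; the function names and the numerical values in the usage example are hypothetical.

```python
def boltzmann_gibbs(eta, G):
    """Psi(eta)(x) = G(x) eta(x) / eta(G) for a distribution eta on a finite set."""
    norm = sum(e * g for e, g in zip(eta, G))
    return [e * g / norm for e, g in zip(eta, G)]

def flow_step(eta, G, M):
    """One step eta_{n+1} = Psi_n(eta_n) M_{n+1} of the normalized flow."""
    updated = boltzmann_gibbs(eta, G)
    size = len(eta)
    return [sum(updated[x] * M[x][y] for x in range(size)) for y in range(size)]

if __name__ == "__main__":
    eta0 = [0.5, 0.5]                  # hypothetical initial distribution
    G = [0.5, 1.5]                     # hypothetical strictly positive potential
    M = [[0.7, 0.3], [0.4, 0.6]]       # hypothetical Markov transition
    print(flow_step(eta0, G, M))
```

The division by eta(G) is what makes the map nonlinear in eta, and it is this nonlinearity that the particle models of this section approximate.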

1.5.1 Discrete Time Models 

In the discrete time situation, we observe that there exists a nonunique
collection of Markov transitions $K_{n,\eta}$ indexed by the set of probability
measures $\eta \in \mathcal{P}(E)$ and the time parameters $n \in \mathbb{N}$ and such that

$$\eta_{n+1} = \eta_n K_{n+1,\eta_n}$$




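For instance, we can choose

$$K_{n+1,\eta_n}(x,dz) = \int_E S_{n,\eta_n}(x,dy)\, M_{n+1}(y,dz) \qquad (1.13)$$

with the collection of selection type transitions

$$S_{n,\eta_n}(x,dy) = \epsilon_n G_n(x)\, \delta_x(dy) + (1 - \epsilon_n G_n(x))\, \Psi_n(\eta_n)(dy)$$

In the display above the nonnegative parameters $\epsilon_n \ge 0$ are chosen such
that $\epsilon_n G_n \le 1$. The N-particle model associated with a given collection of
Markov transitions $K_{n,\eta}$ satisfying the compatibility condition

$$\eta K_{n+1,\eta} = \Psi_n(\eta) M_{n+1}$$

is an $E^N$-valued Markov chain

$$\xi_n^{(N)} = (\xi_n^{(N,1)}, \ldots, \xi_n^{(N,N)})$$

Its elementary transitions are defined as

$$\mathrm{Proba}\left(\xi_{n+1}^{(N)} \in d(x^1,\ldots,x^N) \mid \xi_n^{(N)}\right) = \prod_{i=1}^N K_{n+1,\eta_n^N}(\xi_n^{(N,i)}, dx^i) \quad\text{with}\quad \eta_n^N = \frac{1}{N}\sum_{j=1}^N \delta_{\xi_n^{(N,j)}}$$

The initial system $\xi_0^{(N)}$ consists of N independent and identically
distributed random variables with common law $\eta_0$. For sufficiently regular
transitions, we prove that for any time horizon n and as $N \to \infty$

$$\eta_n^N = \frac{1}{N}\sum_{i=1}^N \delta_{\xi_n^{(N,i)}} \longrightarrow \eta_n$$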

The convergence above can be understood in various ways. These
asymptotic properties will be discussed in full detail in Chapter 7. Mimicking
formula (1.6), we also construct an unbiased estimate for the unnormalized
model and prove that as $N \to \infty$

$$\gamma_n^N(\cdot) = \eta_n^N(\cdot) \prod_{p=0}^{n-1} \eta_p^N(G_p) \longrightarrow \gamma_n(\cdot) = \eta_n(\cdot) \prod_{p=0}^{n-1} \eta_p(G_p)$$

To get an intuitive feel of the motion of the particle models associated with
Feynman-Kac flows, we provide a brief description of the particle model
associated with the choice of transitions $K_{n,\eta}$ defined in (1.13). To clarify
the presentation, we suppress the index parameter N and write $\xi_n$ and
$\xi_n^i$ instead of $\xi_n^{(N)}$ and $\xi_n^{(N,i)}$. In this simplified notation, we first
observe the selection transitions take the form

$$S_{n,\eta_n^N}(\xi_n^i, \cdot) = \epsilon_n G_n(\xi_n^i)\, \delta_{\xi_n^i}(\cdot) + (1 - \epsilon_n G_n(\xi_n^i))\, \Psi_n\Big(\frac{1}{N}\sum_{j=1}^N \delta_{\xi_n^j}\Big)(\cdot) \qquad (1.14)$$




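with the discrete Boltzmann-Gibbs distribution given by

$$\Psi_n\Big(\frac{1}{N}\sum_{i=1}^N \delta_{\xi_n^i}\Big) = \sum_{i=1}^N \frac{G_n(\xi_n^i)}{\sum_{j=1}^N G_n(\xi_n^j)}\, \delta_{\xi_n^i} \qquad (1.15)$$

From this observation, we see that the two-step transitions of the nonlinear
distribution flow

$$\eta_n \xrightarrow{\ S_{n,\eta_n}\ } \Psi_n(\eta_n) = \eta_n S_{n,\eta_n} \xrightarrow{\ M_{n+1}\ } \eta_{n+1} = \Psi_n(\eta_n) M_{n+1}$$

are approximated by a two-step and $E^N$-valued Markov chain

$$\xi_n \in E^N \xrightarrow{\ \text{selection}\ } \widehat{\xi}_n \in E^N \xrightarrow{\ \text{mutation}\ } \xi_{n+1} \in E^N$$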

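In other words, during the selection transition, each particle evolves
randomly as $\xi_n^i \to \widehat{\xi}_n^i$ according to the transition (1.14). With a probability
$\epsilon_n G_n(\xi_n^i)$, the particle remains in the same location and we set
$\widehat{\xi}_n^i = \xi_n^i$; otherwise, we select randomly a new particle $\widetilde{\xi}_n^i$ in the current
configuration with the Boltzmann-Gibbs distribution (1.15) and we set
$\widehat{\xi}_n^i = \widetilde{\xi}_n^i$. During the mutation stage, each particle evolves randomly as
$\widehat{\xi}_n^i \to \xi_{n+1}^i$ according to the mutation transition $M_{n+1}$. Note that the
selection transitions $\xi_n^i \to \widehat{\xi}_n^i$ can alternatively be described by the
following acceptance/rejection mechanism

$$\widehat{\xi}_n^i = g_n^i\, \xi_n^i + (1 - g_n^i)\, \widetilde{\xi}_n^i$$

where $(g_n^i)_{1 \le i \le N}$ is a collection of conditionally independent Bernoulli random
variables with respective distributions

$$\mathbb{P}(g_n^i = 1 \mid \xi_n) = 1 - \mathbb{P}(g_n^i = 0 \mid \xi_n) = \epsilon_n G_n(\xi_n^i) \qquad (1.16)$$

If we let $\tau^i$ be the first time we have $g_n^i = 0$, then it is readily checked that
for any $n \ge 0$

$$\mathbb{P}(\tau^i \ge n) = \mathbb{E}\left(\prod_{p=0}^{n-1} \epsilon_p G_p(\xi_p^i)\right)$$

and therefore

$$\mathrm{Law}(\xi_0^i, \ldots, \xi_n^i \mid \tau^i \ge n) = \mathbb{Q}_n$$

More generally, on the intersection of events $\cap_{i=1}^N (\tau^i \ge n)$, the path
particles $(\xi_0^i, \ldots, \xi_n^i)_{1 \le i \le N}$ are independent and identically distributed with
common law $\mathbb{Q}_n$. Loosely speaking, up to the first interaction time, the
interacting particle model produces independent samples according to the
desired distribution.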
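
The selection/mutation mechanism can be sketched on a two-state space, where the exact flow is available for comparison. This is a minimal illustration and not the author's implementation: the model (`M`, `G`, `ETA0`), the choice of a time-independent acceptance parameter `EPS`, and the function names are hypothetical.

```python
import itertools
import random

# Hypothetical two-state model, chosen only for illustration.
M = [[0.7, 0.3], [0.4, 0.6]]   # mutation transition M(x, y)
G = [0.5, 1.5]                 # strictly positive potential G(x)
ETA0 = [0.5, 0.5]              # initial law eta_0
EPS = 1.0 / max(G)             # acceptance parameter with EPS * G(x) <= 1

def exact_eta(n):
    """Exact flow eta_{p+1} = Psi(eta_p) M, iterated n times."""
    eta = ETA0[:]
    for _ in range(n):
        norm = eta[0] * G[0] + eta[1] * G[1]
        psi = [eta[x] * G[x] / norm for x in range(2)]
        eta = [psi[0] * M[0][y] + psi[1] * M[1][y] for y in range(2)]
    return eta

def particle_eta(n, num_particles, seed=0):
    """Empirical measure of the N-particle model after n selection/mutation steps."""
    rng = random.Random(seed)
    particles = [0 if rng.random() < ETA0[0] else 1 for _ in range(num_particles)]
    for _ in range(n):
        cum_weights = list(itertools.accumulate(G[x] for x in particles))
        selected = []
        for x in particles:
            if rng.random() < EPS * G[x]:
                selected.append(x)   # accepted: the particle stays where it is
            else:
                # rejected: redraw from the discrete Boltzmann-Gibbs distribution
                selected.append(rng.choices(particles, cum_weights=cum_weights)[0])
        # mutation: each selected particle moves according to M
        particles = [0 if rng.random() < M[x][0] else 1 for x in selected]
    return [particles.count(y) / num_particles for y in range(2)]

if __name__ == "__main__":
    print(exact_eta(5))
    print(particle_eta(5, 2000))
```

Setting `EPS = 0` recovers plain multinomial resampling at every step; a positive acceptance parameter leaves high-potential particles in place more often and so reduces the number of interactions.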

In Figure 1.3, we have presented the schematic selection/mutation
transitions of eight particles. The particles with low potential are killed, while
the ones with high potential duplicate into several offsprings.




[Figure: N = 8 particles; axes: state space and time; markers indicate the particle potential]

FIGURE 1.3. Genetic particle model

In the time homogeneous situation and under appropriate regularity
conditions these particle models also provide a natural particle approximation
of the fixed point (whenever it exists) of the nonlinear transformations

$$\mu \in \mathcal{P}(E) \longrightarrow \Phi(\mu) = \Psi(\mu) M \in \mathcal{P}(E)$$

We have in some sense, and as n and N tend to infinity

$$\eta_n^N \longrightarrow \eta_\infty = \Phi(\eta_\infty)$$

We will make precise this convergence with several uniform $L_p$ and strong
propagation-of-chaos estimates. In this connection, we mention that the
increasing propagation-of-chaos properties developed in this book also
provide precise information on the asymptotic behavior of the invariant
measure of the interacting process as the size of the systems tends to infinity.

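If we interpret the selection transition as a birth and death process, then
arises the important notion of the ancestral line of a current individual.
More precisely, when a particle $\widehat{\xi}_{n-1}^i \to \xi_n^i$ evolves to a new location $\xi_n^i$,
we can interpret $\widehat{\xi}_{n-1}^i$ as the parent of $\xi_n^i$. Looking backwards in time and
recalling that the particle $\widehat{\xi}_{n-1}^i$ has selected a site $\xi_{n-1}^j$ in the configuration
at time $(n-1)$, we can interpret this site $\xi_{n-1}^j$ as the parent of $\widehat{\xi}_{n-1}^i$ and
therefore as the ancestor $\xi_{n-1,n}^i$ at level $(n-1)$ of $\xi_n^i$. Running back in time
we trace mentally the whole ancestral line

$$\xi_{0,n}^i \longleftarrow \xi_{1,n}^i \longleftarrow \cdots \longleftarrow \xi_{n-1,n}^i \longleftarrow \xi_{n,n}^i = \xi_n^i$$

of each current individual. More interestingly, the occupation measures of
the corresponding N-genealogical tree model converge as $N \to \infty$ to the
Feynman-Kac path measures. In a sense to be given, we have the
convergence, as $N \to \infty$,

$$\frac{1}{N} \sum_{i=1}^N \delta_{(\xi_{0,n}^i, \xi_{1,n}^i, \ldots, \xi_{n,n}^i)} \longrightarrow \mathbb{Q}_n$$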

Figure 1.4 represents the genealogical tree model associated with ten
interacting particles. The initial probability density gives an idea of the initial
configuration of the system, while the concentration of current individuals
provides an approximation of the current probability density. Note that the
ancestors concentration has to be related to the marginal at time 0 of $\mathbb{Q}_n$.

We finally recall that the Feynman-Kac measures $\mathbb{Q}_n$ have the same
dynamical structure as their end time marginals $\eta_n$. From this stability
property, we will see that the genealogical tree-based model is nothing but
the N-particle model associated with a nonlinear distribution flow in path
space.
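
The path space point of view can be sketched by letting each particle carry its whole ancestral line, so that selection copies entire paths. The example below is a hedged illustration on a hypothetical two-state model; it uses plain multinomial selection (the case where the acceptance parameter is zero) and compares the time 0 marginal of the genealogical tree with the exact marginal of the Feynman-Kac path measure, computed by a backward recursion.

```python
import random

# Hypothetical two-state model, chosen only for illustration.
M = [[0.7, 0.3], [0.4, 0.6]]   # mutation transition M(x, y)
G = [0.5, 1.5]                 # strictly positive potential G(x)
ETA0 = [0.5, 0.5]              # initial law eta_0

def exact_time_zero_marginal(n):
    """Marginal at time 0 of Q_n: proportional to eta_0(a) E_a(prod_{p<n} G(X_p))."""
    h = [1.0, 1.0]   # h(x) = E_x(product of the next k potentials), starting with k = 0
    for _ in range(n):
        h = [G[x] * (M[x][0] * h[0] + M[x][1] * h[1]) for x in range(2)]
    unnormalized = [ETA0[a] * h[a] for a in range(2)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def genealogical_paths(n, num_particles, seed=0):
    """Ancestral lines (tuples of length n + 1) of the N-particle model."""
    rng = random.Random(seed)
    paths = [(0 if rng.random() < ETA0[0] else 1,) for _ in range(num_particles)]
    for _ in range(n):
        weights = [G[path[-1]] for path in paths]
        # selection acts on whole ancestral lines (multinomial resampling)
        paths = rng.choices(paths, weights=weights, k=num_particles)
        # mutation extends each line by one step of M
        paths = [path + (0 if rng.random() < M[path[-1]][0] else 1,) for path in paths]
    return paths

if __name__ == "__main__":
    paths = genealogical_paths(5, 20000)
    ancestors_at_zero = sum(1 for path in paths if path[0] == 0) / len(paths)
    print(exact_time_zero_marginal(5)[0], ancestors_at_zero)
```

The ancestors at time 0 are no longer distributed according to the initial law: lines started in the low-potential state are pruned by the successive selections, which is the concentration effect described for Figure 1.4.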

1.5.2 Continuous Time Models 

Using the same line of argument as in the discrete time situation, the
continuous time evolution equation (1.8) can be rewritten in the form

$$\frac{d}{dt}\eta_t(f) = \eta_t(L_{t,\eta_t}(f))$$

where $L_{t,\eta}$ is a nonunique collection of infinitesimal generators satisfying
the compatibility condition

$$\eta(L_{t,\eta}(f)) = \eta(L_t(f)) + \eta(f(V_t - \eta(V_t))) \qquad (1.17)$$

To construct simple examples of generators satisfying this condition, we
proceed as in the Schrödinger operator example of Section 1.4.3. We let $V_t^+$
and $(-V_t^-)$ be the positive and negative parts of the potential function $V_t$
and we denote by $\widehat{L}_{t,\eta}^-$ and $\widehat{L}_{t,\eta}^+$ the jump type generators

$$\widehat{L}_{t,\eta}^-(f)(x) = V_t^-(x) \int (f(y) - f(x))\, \eta(dy)$$

$$\widehat{L}_{t,\eta}^+(f)(x) = \int (f(y) - f(x))\, V_t^+(y)\, \eta(dy)$$

Arguing as before, we check that the class of generators given by 

^ ^ ^ 

l>t,r\ = it + it,»} with Lt,n = it.Tj + it,f> 

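satisfies the desired compatibility condition (1.17). Note that we can alternatively
use the potential functions $(V_t - \eta(V_t))^+$ and $(V_t - \eta(V_t))^-$
associated with the positive/negative part decomposition

$$V_t - \eta(V_t) = (V_t - \eta(V_t))^+ - (V_t - \eta(V_t))^-$$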
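
The compatibility check can be written out in three lines. The derivation below is a sketch under an assumption: it takes the two jump generators to be of the killing/selection form stated in the comments, with $V_t = V_t^+ - V_t^-$ and $\eta$ a probability measure. That form is our reading of the definitions above.

```latex
% Assumed forms of the jump generators (our reading, with V_t = V_t^+ - V_t^-):
%   \widehat{L}^-_{t,\eta}(f)(x) = V_t^-(x) \int (f(y) - f(x)) \, \eta(dy)
%   \widehat{L}^+_{t,\eta}(f)(x) = \int (f(y) - f(x)) \, V_t^+(y) \, \eta(dy)
\begin{align*}
\eta\big(\widehat{L}^-_{t,\eta}(f)\big)
  &= \int \eta(dx)\, V_t^-(x)\,\big(\eta(f) - f(x)\big)
   = \eta(V_t^-)\,\eta(f) - \eta(f\,V_t^-), \\
\eta\big(\widehat{L}^+_{t,\eta}(f)\big)
  &= \int \eta(dx) \int \big(f(y) - f(x)\big)\, V_t^+(y)\,\eta(dy)
   = \eta(f\,V_t^+) - \eta(f)\,\eta(V_t^+), \\
\eta\big(\widehat{L}^-_{t,\eta}(f)\big) + \eta\big(\widehat{L}^+_{t,\eta}(f)\big)
  &= \eta\big(f\,(V_t^+ - V_t^-)\big) - \eta(f)\,\eta(V_t^+ - V_t^-)
   = \eta\big(f\,(V_t - \eta(V_t))\big).
\end{align*}
```

Adding $\eta(L_t(f))$ to both sides of the last line gives exactly the right-hand side of (1.17). The computation uses the potential only through $V_t - \eta(V_t)$, which is why other positive/negative decompositions of the centered potential work equally well.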



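The N-particle model associated with a compatible class of generators
$L_{t,\eta}$ is a nonhomogeneous Markov process

$$\xi_t^{(N)} = \big(\xi_t^{(N,1)}, \xi_t^{(N,2)}, \ldots, \xi_t^{(N,N)}\big)$$

taking values in $E^N$. Its infinitesimal generator is defined for a sufficiently
regular test function $\varphi$ on $E^N$ by the formula

$$\mathcal{L}_t^{(N)}(\varphi)(x^1, \ldots, x^N) = \sum_{i=1}^N L^{(i)}_{t,\,\frac{1}{N}\sum_{j=1}^N \delta_{x^j}}(\varphi)(x^1, \ldots, x^i, \ldots, x^N)$$

We have used the notation $L_{t,\eta}^{(i)}$ instead of $L_{t,\eta}$ when the generator acts on
the i-th component of the test function.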

Loosely speaking, the motion of the N-particle model associated with the
previous example is decomposed into three mechanisms. As in the discrete
time case, we simplify the presentation, suppressing the index parameter
N, and we write $\xi_t$ and $\xi_t^i$ instead of $\xi_t^{(N)}$ and $\xi_t^{(N,i)}$.

Between the interacting jumps, each particle $\xi_t^i$ evolves randomly according
to an $L_t$-motion. During its exploration of the state space, it is killed at
rate $V_t^-(\xi_t^i)$ and instantly a randomly chosen particle in the configuration
splits into two offspring. In addition, at rate $\eta_t^N(V_t^+) = \frac{1}{N}\sum_{i=1}^N V_t^+(\xi_t^i)$,
a randomly chosen particle is replaced by a new one randomly chosen
with a probability proportional to its adaptation $V_t^+(\xi_t^i)$. The initial system
$\xi_0 = (\xi_0^i)_{1 \le i \le N}$ consists of N independent and identically distributed
random particles with common law $\eta_0$.

In Figure 1.5, we have presented the schematic evolution of five particles. 
Note that, in contrast to the discrete time case, each particle interacts and 
jumps to a new selected site at random exponential times. 








FIGURE 1.5. Genetic particle model (continuous time) 

For sufficiently regular generators $L_{t,\eta}$, we prove that, for any fixed time
horizon t, as $N \to \infty$ and in a sense to be given

$$\eta_t^N = \frac{1}{N} \sum_{i=1}^N \delta_{\xi_t^i} \longrightarrow \eta_t$$

As for discrete time models mimicking formula (1.6), we construct an
unbiased estimate for the unnormalized distribution flow. We prove that as
N tends to infinity

$$\gamma_t^N(\cdot) = \eta_t^N(\cdot)\,\exp\int_0^t \eta_s^N(V_s)\,ds \longrightarrow \gamma_t(\cdot) = \eta_t(\cdot)\,\exp\int_0^t \eta_s(V_s)\,ds$$
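
The interacting jump mechanism and the estimate of the unnormalized flow can be illustrated with a crude time-stepping sketch. Everything specific here is an assumption made for the example: a purely nonpositive potential $V = -V^-$ (so only the killing/splitting mechanism is active), Brownian free motion, a standard Gaussian initial law, and an Euler step `dt` that only approximates the exponential jump times. The function name `simulate_killing_model` is ours.

```python
import numpy as np


def simulate_killing_model(n_particles, t_end, dt, kill_rate, sigma, rng):
    """Time-discretized sketch of the interacting jump model with V = -kill_rate.

    Returns the final particle positions and the estimate of gamma_t(1),
    that is exp of the time integral of eta_s^N(V_s).
    """
    x = rng.normal(size=n_particles)        # assumed initial law: N(0, 1)
    log_gamma = 0.0
    for _ in range(int(round(t_end / dt))):
        # Free motion between jumps (assumed Brownian with scale sigma).
        x = x + sigma * np.sqrt(dt) * rng.normal(size=n_particles)
        rates = kill_rate(x)                # V^-(x^i) for each particle
        # Riemann sum for the integral of eta_s^N(V_s), with V_s = -kill_rate.
        log_gamma -= rates.mean() * dt
        # A particle is killed with probability about rate * dt and is
        # replaced by a copy of a uniformly chosen particle, which "splits".
        killed = rng.random(n_particles) < rates * dt
        donors = rng.integers(0, n_particles, size=n_particles)
        x = np.where(killed, x[donors], x)
    return x, np.exp(log_gamma)


rng = np.random.default_rng(1)
# Hypothetical killing rate: particles far from the origin die faster.
particles, gamma_hat = simulate_killing_model(
    n_particles=200, t_end=1.0, dt=0.01,
    kill_rate=lambda x: x**2, sigma=1.0, rng=rng,
)
```

The population size stays equal to N at all times, as in the description above. For a constant killing rate c, the empirical average of the potential is exactly -c, so the estimate of $\gamma_t(1)$ reduces to $e^{-ct}$ whatever the particles do.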

As in the discrete time situation for sufficiently regular and time homogeneous
Feynman-Kac models, the particle approximation measures $\eta_t^N$ also
provide a natural particle approximation of the fixed point $\eta_\infty$ (whenever
it exists) of the integro-differential equation (1.17) and we have in some
sense and as t and N tend to infinity

$$\eta_t^N \longrightarrow \eta_\infty$$

We can also interpret interacting jumps as birth and death mechanisms
and define in this way the ancestral lines $(\xi_{s,t}^i)_{0 \le s \le t}$ of each current individual
$\xi_t^i = \xi_{t,t}^i$. Using the structural stability properties of Feynman-Kac flows,
we also prove that the genealogical tree particle model is nothing but the
N-particle model associated with the path-distribution flows $\mathbb{Q}_t$. We use
this observation to check that the occupation measures of the corresponding
N-genealogical model converge as $N \to \infty$ to the Feynman-Kac path
measures. That is, in a sense to be given, we have the convergence, as N
tends to infinity,

$$\frac{1}{N} \sum_{i=1}^N \delta_{(\xi_{s,t}^i)_{0 \le s \le t}} \longrightarrow \mathbb{Q}_t$$







1.6 Sequential Monte Carlo Methodology 

One central question arising in scientific computing and applied mathematics
is to find numerical approximations of a given nonnegative measure $\pi$ on
some measurable space $(E, \mathcal{E})$. In other words, the objective is to evaluate
for any bounded measurable function f the integrals $\pi(f) = \int \pi(dx)\,f(x)$
or more generally the value $F(\pi)$ of some functional $F : \mathcal{P}(E) \to \mathbb{R}$. One
important and rather generic class of measures arising in various scientific
disciplines is the Boltzmann-Gibbs distributions defined by

$$\Psi(\eta)(dx) = \frac{1}{\eta(G)}\,G(x)\,\eta(dx) \qquad (1.18)$$

where G represents a potential function on some measurable space $(E, \mathcal{E})$
and $\eta$ a given (probability) measure on E such that $\eta(G) > 0$. We first
notice that any Feynman-Kac distribution and therefore all the examples
we have presented so far are particular Boltzmann-Gibbs measures (for
more details we refer the reader to Section 11.9 and to Exercise 11.9.1).
In physics, the complex interaction between atoms in ferromagnetic spin
models is often expressed in terms of Boltzmann-Gibbs measures (see Section
11.2). In biology, these distributions are also used as a simplified model
for DNA sequences and protein chains [215, 227, 234, 256]. In engineering
science, optimization and regulation problems consist in computing the
global minima of a given numerical cost function V on a given state space
E. For instance, the cost function can measure the performance of a given
quality test schedule performed at each successive stage of an industrial
production chain. In some image reconstruction problems, the function V
measures the resemblance of a brightness pattern of pixel configurations
to a given reference image [18, 113, 154]. In both cases, the desired minima
can be interpreted as the modes of a Boltzmann-Gibbs measure with
potential function $G = e^{-V}$.

More generally, in the Bayesian perspective, “any estimation problem”
can be interpreted as a Boltzmann-Gibbs measure of the form (1.18). In
this context, the reference measure $\eta$ and the function G are the so-called
prior/posterior distributions. These Bayesian models are often described
by a pair (X, Y) of (d + d')-dimensional random variables having a
joint probability density of the form $p_{X,Y}(x, y) = p_{Y|X}(y|x)\,p_X(x)$, where
$p_X$ is the marginal of $p_{X,Y}$ with respect to the first component X, and
$p_{Y|X}(\cdot\,|x)$ is the conditional density of Y given X = x. The random variable
Y represents the partial and noisy observation of some random state X.
To estimate X given the observation Y = y, we need to compute the
conditional density

$$p_{X|Y}(x|y) = \frac{p_{Y|X}(y|x)}{p_Y(y)}\,p_X(x)$$

with the normalizing constant $p_Y(y) = \int p_{Y|X}(y|x)\,p_X(x)\,dx$. If we fix the
observation data Y = y and if we set $\eta(dx) = p_X(x)\,dx$ and $G(x) = p_{Y|X}(y|x)$,
then these conditional distributions coincide with the Boltzmann-Gibbs
distributions defined in (1.18). In this context, we also notice that
the normalizing constant $p_Y(y)$ coincides with the corresponding partition
function. We emphasize that the Bayesian methodology is often more
directly related to nonlinear filtering and Feynman-Kac models. In this
interpretation, it is often more judicious to take advantage of the dynamical
or the multiplicative structure of the desired conditional distributions
to define a working interacting particle algorithm. In statistical literature,
these filtering models are rather called hidden Markov chain models. This
situation is often related to distribution densities $p_{X,Y} = p_{\theta,X,Y}$, whose
values depend on some unknown random variable $\theta$. Traditionally, $\theta$ represents
the unknown distribution of X and we want to estimate the latter
using the observations of the state sequence X. If we take the pair $(\theta, X)$
as the new state variable, we see that this situation reduces to the first one
by a simple state-space enlargement. We refer the reader to any textbook on
hidden Markov chains and Bayesian statistics (see for instance
[132, 199, 206, 223, 275, 237] and references therein).
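
On a finite state space the identification of the posterior with a Boltzmann-Gibbs measure is a two-line computation. The sketch below assumes a three-point state space with a made-up prior and likelihood. The helper name `boltzmann_gibbs` is ours.

```python
import numpy as np


def boltzmann_gibbs(potential, eta):
    """Return Psi(eta) = G * eta / eta(G) on a finite state space.

    `potential` holds the values G(x) and `eta` the reference probabilities.
    The second return value is the partition function eta(G).
    """
    unnormalized = np.asarray(potential, dtype=float) * np.asarray(eta, dtype=float)
    partition = unnormalized.sum()          # eta(G), must be strictly positive
    if partition <= 0.0:
        raise ValueError("eta(G) must be positive")
    return unnormalized / partition, partition


# Hypothetical Bayesian example: prior p_X on {0, 1, 2} and likelihood
# p_{Y|X}(y | x) evaluated at one fixed observation y.
prior = np.array([0.5, 0.3, 0.2])           # eta(dx) = p_X(x) dx
likelihood = np.array([0.1, 0.4, 0.8])      # G(x) = p_{Y|X}(y | x)
posterior, evidence = boltzmann_gibbs(likelihood, prior)
```

Here `posterior` plays the role of $p_{X|Y}(\cdot\,|y)$ and `evidence` the role of the normalizing constant $p_Y(y)$.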

The idea of classical Monte Carlo Markov chain methods (often abbreviated
MCMC) is to find a judicious Markov kernel $K_\eta$ with invariant
measure $\Psi(\eta) = \Psi(\eta) K_\eta$. When the Markov chain with elementary transition
kernel $K_\eta$ is sufficiently mixing, one expects after some long and spaced
runs to generate approximate samples of the limiting distribution $\Psi(\eta)$.
There exist several popular algorithms, including the Gibbs sampler and
the Metropolis-Hastings model. It is of course out of the scope of this book
to review the modeling, the convergence, and the applications of these
models. For more details, we refer the reader to Chapter 5 and references
therein. We underline that these “linear type” algorithms suffer from two
main numerical problems. First of all, their convergence to the desired equilibrium
Boltzmann-Gibbs measure strongly depends on the oscillations of
the potential function G. In addition, they are not recursive with respect
to the time parameter. For instance, in every filtering problem, the target
distribution is the conditional distribution of a signal process given the
observations delivered at each time by some sensor. In this context, it is
hopeless to run a different Markov chain between each pair of
consecutive observations. More generally, suppose we are given a collection
of target distributions $\eta_n$ on some measurable spaces $(E_n, \mathcal{E}_n)$. Further assume
that they satisfy a recursive equation of the form

$$\eta_{n+1} = \eta_n K_{n+1,\eta_n}$$

for some collection of Markov transitions $K_{n+1,\eta}$ from some measurable
space $(E_n, \mathcal{E}_n)$ into $(E_{n+1}, \mathcal{E}_{n+1})$. Here again, to approximate these measure-valued
equations or generate samples according to each $\eta_n$, it is not really
judicious to run a different Markov chain model with invariant distribution
$\eta_n$. The advantages in applying the interacting particle interpretations of
these nonlinear models are twofold. First, they are recursive with respect to
the time parameter. On the other hand, even when the target distribution
$\pi$ is not related to some of these recursions, we can use the time reversal
Feynman-Kac representation (1.12) presented in Section 1.4.5 and its particle
approximation model to improve the convergence to equilibrium of
the algorithm.
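
The invariance requirement placed on an MCMC kernel can be checked exactly on a finite state space. The sketch below builds the transition matrix of a standard Metropolis-Hastings chain for a given target and proposal. The target values and the uniform proposal are assumptions chosen for the example, and `metropolis_kernel` is our own helper name.

```python
import numpy as np


def metropolis_kernel(target, proposal):
    """Metropolis-Hastings transition matrix leaving `target` invariant.

    `target` is a strictly positive probability vector and `proposal`
    a row-stochastic matrix on the same finite state space.
    """
    n = len(target)
    kernel = np.zeros((n, n))
    for x in range(n):
        for y in range(n):
            if x != y and proposal[x, y] > 0.0:
                ratio = (target[y] * proposal[y, x]) / (target[x] * proposal[x, y])
                # Propose y from x, then accept with probability min(1, ratio).
                kernel[x, y] = proposal[x, y] * min(1.0, ratio)
        # Rejected proposals leave the chain where it is.
        kernel[x, x] = 1.0 - kernel[x].sum()
    return kernel


# Hypothetical Boltzmann-Gibbs target on three states: G * eta / eta(G).
target = np.array([0.1, 0.4, 0.8]) * np.array([0.5, 0.3, 0.2])
target = target / target.sum()
# Uniform proposal over the two other states.
proposal = (np.ones((3, 3)) - np.eye(3)) / 2.0
K = metropolis_kernel(target, proposal)
invariance_gap = np.abs(target @ K - target).max()
```

The quantity `invariance_gap` measures how far $\pi K$ is from $\pi$. It vanishes up to rounding because the kernel satisfies detailed balance with respect to the target. A separate kernel of this kind would be needed for every new target, which is the lack of recursivity discussed above.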

For many statistical engineers and physicists, any Monte Carlo Markov
chain algorithm falls somewhere between a Metropolis-Hastings model and
a Gibbs sampler. This rather simplistic idea is partly true if one believes
that any kind of stochastic algorithm is a simple Markov chain. We em- 
phasize here that evolutionary type models such as genetic type branch- 
ing algorithms and the Feynman-Kac-Metropolis algorithms described in 
Section 5.4 are nonlinear type models and do not fit in the same frame- 
work as MCMC methods. These modern strategies rather belong to mean 
field interacting particle systems and nonlinear measure-valued processes. 
In connection with the traditional Metropolis acceptance/rejection transi- 
tion, the idea here is simply to duplicate better-fitted proposals. Loosely 
speaking, in these branching type algorithms, the time precision parameter 
of traditional MCMC methods is replaced by a population size parameter. 
Various authors have tried to find the exact origins of this natural splitting 
idea (some recent attempts can be found in [125, 139, 157, 167, 227]). In 
our opinion, these evolutionary techniques of duplicating better-fitted in- 
dividuals are rather natural and arise in so many human endeavors that it 
is more interesting to trace back in time their use in different application 
areas. 



1.7 Particle Interpretations 

The particle models described above can be interpreted in many different ways,
depending on the Feynman-Kac application model areas we consider. Apart
from the pure mathematical interpretation as a stochastic linearization 
technique, three additional interpretations can be underlined. 

From the point of view of physics, statistical mechanics, biology, and 
industrial chemistry, these particle models provide a concrete microscopic 
interpretation of the evolution of some physical quantity. For instance, in 
the trapping problems examined in Section 1.4.3, the particle model may 
represent the evolution of a collection of particles in an absorbing medium. 
When a particle enters an obstacle and dies, a particle with better adaptation
splits into two offspring. These interacting jump mechanisms are
also related to the microstatistical interpretation of Schrödinger equations
discussed in Section 1.4.3. In this context, the particle model can be seen
as the evolution of relativistic quantum particles with creation and killing
transitions (see also [254]). We also mention that the continuous time
Feynman-Kac model and its particle approximation model can be seen as 
a simple generalized Boltzmann equation (as defined in [245]) with the 
corresponding Nanbu particle interpretation. In this situation, the parti- 
cle model represents the evolution of colliding molecules in a rarefied gas 
model. In this context, genealogical models represent the historical pro- 
cess of splitting particles in trapping models or the history of collisions 
in rarefied gas equations. The genetic type particle models introduced in 
Section 1.5.1 and Section 1.5.2 are clearly related to natural evolution mod- 
els. They can be used to model random mutation/selection transitions in 
gene analysis. In this connection, the genealogical models can be seen as
the random evolution of the ancestral lines of species. In polymer analysis,
the Feynman-Kac path models represent the Boltzmann-Gibbs measures
of a polymerization sequence (see Section 1.4.4 and Section 12.5). In this 
context, the particle motion can be regarded as the chemical construction 
of monomers with intermolecular interaction in a given solvent. The cor- 
responding genealogical models represent the chemical construction of a 
sequence of flexible and directed polymers. 

From the point of view of evolutionary mathematics and engineering sciences,
these probabilistic methods can be viewed as global and adaptive
stochastic searches of a state space. The motion and the adaptation of
individuals are related to natural evolution mechanisms such as gene mutations
and/or type selection. This interpretation also enters physical and
biological intuitions in engineering problems. We can relate in this way 
learning and adaptation mechanisms with evolutionary and microscopic 
particle motions. In this connection, particle search methods can also be 
regarded as a stochastic grid approximation. They also complement more 
traditional deterministic grid techniques. Their main advantage is to refine 
the precision of the grid in accordance with the mass variation dictated 
by the distribution equation. In this interpretation, genealogical models 
represent the adaptation history of the grid. 

From a statistical point of view, the genealogical and interacting particle
models presented in this book can also be regarded as a new approximation
simulation technique for sampling conditional distributions and more generally
for sampling according to Boltzmann-Gibbs type measures on path
spaces. They complement classical Monte Carlo strategies such as the Gibbs
sampling and the Metropolis-Hastings algorithm. In contrast to the latter,
particle simulation methods are recursive with respect to the time parameter.
One of the most illuminating examples that illustrates this point of view
is the Feynman-Kac-Metropolis interpretation of a restricted Markov chain
model described in the introductory Section 1.4.5 and further developed in
Section 5.5. The genealogical approximation models provide a natural simulation
technique for drawing path samples according to the distribution
of a Markov chain restricted to its terminal values. Furthermore, the underlying
genetic type model can be interpreted as a sequence of interacting
Metropolis models. These particle models are not of pure mathematical
interest. We will see that they have “better asymptotic properties” than
traditional Monte Carlo Markov chain methods.

During the last few decades, the application of particle methods has
grown, establishing unexpected connections with a number of other fields,
including biology [227, 231, 232, 233, 234, 235, 280], electromagnetics [191,
299], neural networks [151], computer networks and telecommunication
analysis [274], traffic control [5], image processing [297], financial mathematics
[23, 285, 302], global optimization problems [108, 160, 161, 162,
192, 193, 205, 306], statistics [8, 9, 124], and signal processing [7, 10, 122,
123, 143].

In each discipline, these particle methods have taken different names
but often have a common theoretical basis. For instance, in applied probability
and engineering literature, there exist at least a dozen different
names for the same particle algorithm: branching and interacting particle
systems [75, 97, 156], Monte Carlo optimal filters [39, 185, 204], genetic
algorithms [181, 182], bootstrap or particle filters [123, 160, 239, 274],
sampling-importance-resampling [57, 164], condensation filters [144], population
Monte Carlo algorithms and Sequential Monte Carlo methods [51,
81, 122, 125, 207, 228], spawning filters [145], switching algorithm [224],
sampled stochastic processes [238], auxiliary particle filters [269], matrix
reconfigurations [179], quantum and diffusion Monte Carlo methods [12, 242],
go with the winner [4, 263, 264], multisplitting and restart method [49, 261],
interacting Metropolis algorithm [80], and ensemble Kalman filters [135].

All of these particle models are built on the same paradigm: When explor- 
ing a state space with many particles, we duplicate better fitted individuals 
at the expense of having light particles with poor fitness die. In this connec- 
tion, we also mention that this selection mechanism has been associated in 
biology and engineering science with the following list of botanical names, 
to name a few: branching selections, bootstrap, adaptive dynamics, switching,
prune enrichments, cloning, reconfiguration, stratification, resampling,
rejuvenation, acceptance/rejection, spawning, and “go with the winner”. 



1.8 A Contents Guide for the Reader 

The book is divided into eight main parts devoted respectively to Markov
path processes, to the precise description of the Feynman-Kac models and
their structural properties, to genealogical and particle models, to stability
properties of Feynman-Kac semigroups, to invariant measures, to annealed
properties, to the asymptotic behavior of particle models, and to applications
in statistics, physics, and engineering sciences.







In Chapter 2, we give a detailed and rigorous mathematical description 
of Feynman-Kac models with an emphasis on path-valued processes and 
unnormalized models. To make the book self-contained, we have provided 
in the introductory Section 2.2 a brief discussion on Markov chains. We 
give three different presentations arising in engineering science, applied 
probability, and Bayesian statistical literature. We also discuss the coor- 
dinate method of constructing a Markov chain on a canonical probability 
space. We conclude with the construction of path-space Markov models. 
Then we list three important structural stability properties connected re- 
spectively to path/marginal models, change of reference probability, and 
updated/prediction flow models. These properties show the potential of 
Feynman-Kac modeling techniques. They exhibit several degrees of free- 
dom in the Feynman-Kac interpretations of a given nonlinear estimation 
problem. To illustrate the impact of these results, we already mention that 
the structural stability property related to path and marginal models will 
lead to a natural construction of genealogical particle models. After this 
round of modeling and structural analysis, we propose two probabilistic 
interpretations of Feynman-Kac distribution flow models. The first one 
concerns the traditional particle-trapping interpretation arising in physics 
literature. The second one is related to interacting and measure-valued 
processes theory. We provide a collection of mean field and McKean type 
interpretations of these flows. In a third part, we discuss Feynman-Kac 
models in random media. We connect the annealed and quenched flows in 
terms of a Feynman-Kac model in distribution space. The impact of this
modeling technique will be illustrated in nonlinear filtering problems with
the construction of interacting Kalman-Bucy filters. Finally, we describe
the semigroup structure of discrete time Feynman-Kac models. These updated
and prediction semigroups will be of constant use in the development
of this book. 

Chapter 3 is concerned with genealogical and interacting particle mod- 
els. It is itself essentially decomposed into three main parts. We first give a 
rigorous mathematical construction of the particle model associated with 
a McKean interpretation of a nonlinear and measure-valued process. This 
particle method is presented here as a stochastic linearization technique of 
a nonlinear equation in distribution space. Then we illustrate this abstract 
formulation with a detailed discussion on the particle interpretations of 
Feynman-Kac models. We also show that genealogical particle tree models 
are natural particle interpretations of Feynman-Kac models in path space. 
Finally, to prepare the applications developed in the next part of the book, 
we review the various particle measures introduced so far with some mo- 
tivating convergence results. We also end this part with a brief discussion 
on the main regularity conditions used in this book. 

Chapters 4, 5, and 6 are devoted to qualitative properties of Feynman- 
Kac semigroups. Chapter 4 focuses on the stability of Feynman-Kac semi- 
groups. In a first part, we analyze the contraction properties of a general 







Markov kernel with respect to an abstract h-relative entropy. We exhibit 
several functional inequalities in terms of the Dobrushin ergodic coefficient. 
Then we use these results to study the stability properties of nonlinear 
Feynman-Kac semigroups. We derive precise contraction estimates with
respect to a class of h-relative entropies, including the traditional Boltzmann
entropy or the total variation distance as well as the $L_2$ or Havrda-Charvat
entropies and the Hellinger integrals. We finally apply these results
to study the stability properties of several classes of models including up- 
dated Feynman-Kac semigroups and stochastic models arising in filtering 
problems. 

In Chapter 5, we discuss the existence and the uniqueness of invariant 
measures. We also design a class of Metropolis type Feynman-Kac models 
admitting a given distribution as an invariant measure. We connect this
model with restricted Markov chains with respect to their terminal values. 
We compare the path-particle algorithms associated with these Feynman- 
Kac models with the traditional Metropolis Markov chain. Finally we give a 
detailed study on the stability properties of this particular class of models. 

In Chapter 6, we examine the long time behavior of a Feynman-Kac 
distribution flow model associated with a cooling schedule. We first examine
the annealed properties of the Feynman-Kac-Metropolis model introduced 
in the preceding chapter. We will see that this model can also be regarded 
as a nonlinear simulated annealing algorithm. We also discuss the annealed 
properties of a class of Feynman-Kac models arising in trapping analysis. 
Special attention is paid to the limiting concentration regions. We propose a
set of sufficient conditions under which the algorithm converges as the time
parameter tends to infinity to the global minima of a potential function. 

Chapters 7, 8, 9, and 10 are concerned with the asymptotic behavior of 
particle methods when the size of the systems tends to infinity. We have 
chosen to present these convergence results with respect to the traditional 
increasing degrees of refinement. A good deal of the theory consists of the 
study of various limit theorems. We provide a detailed analysis going from
the simple and traditional law of large numbers to more sophisticated em- 
pirical process theorems, uniform estimates with respect to the time param- 
eter, weak and strong propagation of chaos, central limit theorems, Berry- 
Esseen inequalities, Donsker type theorems, and large-deviation principles. 
For the convenience of the reader, we have tried to present an “expose” 
of the mathematical theory we shall be using on each of these different 
subjects in a self-contained treatment. Each chapter often starts with a 
preliminary discussion on some more or less traditional results. These in- 
troductory sections also provide several new and complementary results to 
some well-known theorems such as a Berry-Esseen inequality for martingale
sequences as well as a complement to the Laplace-Varadhan integral
transfer lemma in the context of large deviations.

Chapters 11 and 12 are concerned with the applications of the Feynman-Kac
modeling and the particle methodology developed in this book. Chapter
11 should not be skipped. It provides a series of Feynman-Kac and
interacting particle recipes to be used in different application model areas.
In Chapter 12, we describe in some detail applications to restricted Markov
chain sampling, spectral analysis of Feynman-Kac-Schrödinger semigroups,
fixed point approximations of nonlinear equations in distribution space, rare
events estimation, Dirichlet problems with boundary conditions, flexible
polymer chains, filtering, and path estimations of a signal. The emphasis is
given to the Feynman-Kac modeling of the various mathematical quantities
we want to estimate in each of these disciplines:

• In statistical simulation problems, we provide a Feynman-Kac formulation
of the laws of restricted Markov models. First we examine a
class of Markov chains restricted to a given space-time tube. In an- 
other context, we construct a Feynman-Kac semigroup admitting a 
given distribution as an invariant measure. This construction is based 
on a judicious choice of Metropolis type potential function and on a 
time reversal technique. In this situation the resulting particle model 
will behave as a sequence of interacting Metropolis algorithms. Fur- 
thermore, the corresponding genealogical tree-based model can be 
regarded as a path particle simulation technique for drawing sam- 
ples according to the law of Markov path models restricted to their 
terminal values. 

• In the spectral analysis of Feynman-Kac-Schrödinger semigroups, we
describe the Lyapunov exponent and related spectral quantities in 
terms of the invariant measure of a Feynman-Kac model. The particle 
models associated with these exponents provide a natural microscopic 
particle interpretation of these physical quantities. For instance, in 
trapping analysis, these exponents represent the cost of performing 
long crossing in an absorbing medium without being absorbed. In this 
context, the particle coefficients represent the mean averaged energy 
of a flow of interacting particle systems. 

• In rare events analysis, we propose a multilevel Feynman-Kac path 
model to describe the conditional distribution of a stochastic mo- 
tion in a rare event regime. The multilevel decomposition can be 
interpreted as the different successive steps the process needs to pass 
to enter into the upper rare level. In this context, the genealogical 
particle model can be seen as the random tree of different possible 
trajectories leading to this level. 

• In biology and chemistry, flexible polymer chains are often described 
in terms of a Boltzmann-Gibbs distribution. We connect this formulation
with Feynman-Kac models on path spaces. In this connection
each random path can be seen as a sequence of reacting monomers 
in a given solvent. The potential function represents the attractive or 







repulsive interaction between the elementary monomers of a polymer 
chain. As an aside, we mention that self-avoiding random walks are 
simple polymerization models with repulsive interactions. The path 
particle algorithms associated with these models can be interpreted 
as a particle simulation method for sampling flexible polymer chains. 

• In signal processing, we underline the role of Feynman-Kac path measures
in the description of path estimation and nonlinear smoothing
problems. We discuss four different ways to introduce a nonlinear filtering
model: the engineering description in terms of a signal/sensor
equation, the probabilistic interpretation with the pair (signal, observation)
Markov model, the traditional change of reference probability
measure technique, and the Bayesian posterior/prior interpretation.
In this context, we propose a new model of quenched and annealed
formulae to describe a partial linear/Gaussian filtering problem. As
mentioned above, the particle interpretation of the annealed model
consists of a sequence of interacting Kalman-Bucy filters.




2 

Feynman-Kac Formulae 



2.1 Introduction 

In this chapter, we introduce the mathematical structure of Feynman-Kac 
models. 

In a preliminary section, Section 2.2, we provide a brief introduction to 
Markov chain models in abstract path spaces. 

In Section 2.3, we present the abstract construction of Feynman-Kac 
distribution models on general nonhomogeneous state spaces. We also fix 
some notation and terminology currently used in this book. 

Section 2.4 provides some structural properties of Feynman-Kac models. 
We connect path-space models with their time marginals as well as con- 
necting updated models with prediction ones. We also examine the stability 
of Feynman-Kac formulae through a change of reference probability mea- 
sure. The stability properties developed in this section provide a natural 
and simple tool to transfer results from marginal or prediction models to 
path-space models or updated ones. 

Section 2.5 is concerned with two physical interpretations of Feynman- 
Kac distribution flow models. The first one is the traditional interpretation 
in terms of a single particle motion in an absorbing medium. The second 
interpretation comes from nonlinear and measure-valued processes theory. 
We provide an alternative physical interpretation of Feynman-Kac models 
in terms of interacting jump type Markov models. 

In Section 2.6, we discuss Feynman-Kac models in random media. We 
provide an original description of the annealed distribution flows in terms 







of a Feynman-Kac model associated with a measure-valued Markov chain.
These modeling techniques will be used in Section 12.6.7 to define in a natural
way a sequence of interacting Kalman-Bucy filters. To better connect
our work with related literature on the subject, we mention that continuous
time Feynman-Kac formulae with random potentials have been used in a
different context as explicit examples of PDEs in a random environment. In
discrete space, Carmona and Molchanov gave in [40] a detailed treatment
of the so-called parabolic Anderson model. Large time asymptotics and
related Lyapunov exponent estimations for such stochastic Feynman-Kac
flows can also be found in [13, 29, 31, 41, 42, 152].

Section 2.7 is concerned with the fine semigroup structure of Feynman- 
Kac distribution flows. 

Section 2.4 and Section 2.6 can be skipped at a first reading. They may be 
used from time to time as a reference in some precise situations. Sections 2.5 
and 2.7 are the core of this chapter. They provide a detailed discussion on 
the Feynman-Kac models currently used in the further development of this
book. 



2.2 An Introduction to Markov Chains 

A Markov chain is a sequence of random variables $X_n$ with a time index
$n \in \mathbb{N}$ and taking values at each time n in some measurable state
space $(E_n, \mathcal{E}_n)$. In contrast to an arbitrary random sequence, the future of
a Markov evolution model is independent of the past when the present is
given. Loosely speaking, a Markov model is a random and discrete generation
model whose elementary transitions only depend on the most recent
information on its past history.

To motivate this abstract Markov dependence property and to introduce 
the various application areas we have ahead, we have chosen to present 
next three traditional formulations. 

The first one can be regarded as a stochastic version of a control system.
The random component of these models usually consists of a sequence of
independent random variables $U_n$, $n \in \mathbb{N}$, taking values in some
measurable control space $(C_n, \mathcal{C}_n)$. Formally, they are defined
inductively by a recursive equation of the form

$$X_n = F_n(X_{n-1}, U_n) \tag{2.1}$$

The drift function $F_n$ is a given measurable function from $E_{n-1} \times C_n$
into $E_n$, and $X_0$ is an arbitrary random initial condition with
distribution $\mu \in \mathcal{P}(E_0)$. The probabilistic interpretation of the
random sequence $U_n$ depends on the problem at hand. For more details, we
refer the reader to the discussion on filtering models given in Section 1.4.1.
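
The recursion (2.1) translates directly into a simulation loop. The following is only a minimal sketch: the function name `simulate_chain`, its arguments, and the random-walk drift used as an example are illustrative assumptions, not notation from the text.

```python
import random

def simulate_chain(drift, sample_noise, sample_initial, n_steps, rng):
    """Simulate X_0, ..., X_n with X_k = drift(k, X_{k-1}, U_k), as in (2.1)."""
    path = [sample_initial(rng)]                 # X_0 drawn from the initial law
    for k in range(1, n_steps + 1):
        u_k = sample_noise(k, rng)               # independent noise U_k
        path.append(drift(k, path[-1], u_k))     # X_k = F_k(X_{k-1}, U_k)
    return path

# Hypothetical example: F_k(x, u) = x + u with U_k uniform on {-1, +1}.
rng = random.Random(0)
path = simulate_chain(
    drift=lambda k, x, u: x + u,
    sample_noise=lambda k, rng: rng.choice([-1, 1]),
    sample_initial=lambda rng: 0,
    n_steps=10,
    rng=rng,
)
```

Passing the time index `k` to both `drift` and `sample_noise` keeps the sketch compatible with nonhomogeneous models, where $F_n$ and the law of $U_n$ may change with $n$.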







Another more probabilistic way to specify a Markov chain is to consider
its elementary transitions

$$\mathrm{Proba}(X_n \in dx_n \mid X_{n-1} = x_{n-1}) = M_n(x_{n-1}, dx_n)$$

The Markov kernel $M_n(x_{n-1}, dx_n)$ denotes the probability that the chain
$X_{n-1} = x_{n-1} \in E_{n-1}$ at time $(n-1)$ will be at time $n$ “in an
infinitesimal neighborhood” $dx_n$ of the point $x_n \in E_n$. To connect this
formulation with the previous one, we observe that the Markov kernel
associated with the Markov chain (2.1) is simply given by

$$M_n(x_{n-1}, dx_n) = \mathrm{Proba}(F_n(x_{n-1}, U_n) \in dx_n)$$

From the practitioner’s point of view, there may be very little difference
between these two formulations. The first interpretation is often used in
engineering science when the Markov chain represents the evolution of a
physical quantity with a precise structure.

The second formulation is more tractable when the Markov model is
defined by complicated rules with no precise dynamical structure.

Finally, in Bayesian literature, the elementary transitions of a chain are
often described in terms of a hypothetical density function

$$\mathrm{Proba}(X_n \in dx_n \mid X_{n-1} = x_{n-1}) = p_n(x_n \mid x_{n-1})\, dx_n$$

Some authors also use the synthetic notation

$$X_n \mid X_{n-1} = x_{n-1} \;\sim\; p_n(x_n \mid x_{n-1})\, dx_n$$

The term $p_n(x_n \mid x_{n-1})$ represents the density of the Markov transition
with respect to some reference probability measure $dx_n$.

2.2.1 Canonical Probability Spaces 

The rigorous mathematical formulation in terms of Markov kernels is
traditionally used in applied probability literature. Aside from inherent
mathematical interest, this measure-theoretic formulation also provides a
natural semigroup formulation for the asymptotic analysis of Markov chains.
Furthermore, the forthcoming Markov chain models are traditionally described
on a canonical probability space, and their “dynamical structure” only
depends on the underlying probability distribution. Loosely speaking, this
presentation is often less “notationally consuming”:

The same canonical process may have different probabilistic interpreta- 
tions depending on the different probability distributions we consider on 
the canonical space. 







For all these reasons, we will use as often as we can this modern
probabilistic and canonical formulation in the modeling and the analysis of
Feynman-Kac semigroups.

Since one of the goals of this section is to guide the reader to abstract 
Markov chains on general nonhomogeneous state spaces, we give next a 
brief presentation of the rigorous mathematical construction of a canonical 
Markov chain. This construction is sometimes called the coordinate method 
of constructing the Markov chain. These models will be of constant use in 
the further development of this chapter. 

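To start with, we use the Markov dependence property to check that

$$\begin{aligned}
&\mathrm{Proba}((X_0, \ldots, X_n) \in d(x_0, \ldots, x_n)) \\
&\qquad = \mathrm{Proba}((X_0, \ldots, X_{n-1}) \in d(x_0, \ldots, x_{n-1}))\; M_n(x_{n-1}, dx_n) \\
&\qquad = \mu(dx_0)\, M_1(x_0, dx_1) \cdots M_n(x_{n-1}, dx_n)
\end{aligned}$$

From this display, we define the distribution of the sequence
$(X_0, \ldots, X_n)$ on $\Omega_n = E_{[0,n]}$ $(= \prod_{p=0}^n E_p)$ equipped with
the product $\sigma$-algebra $\mathcal{F}_n = \bigotimes_{p=0}^n \mathcal{E}_p$ by
setting in the integral sense

$$\mathbb{P}_{\mu,n}(d(x_0, \ldots, x_n)) = \mu(dx_0)\, M_1(x_0, dx_1) \cdots M_n(x_{n-1}, dx_n)$$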
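
On a finite state space the product formula for the path distribution can be evaluated by direct enumeration. The sketch below assumes states labeled $0, \ldots, d-1$, an initial law given as a list, and kernels given as row-stochastic nested lists; the name `path_law` and the numerical values are illustrative choices only.

```python
from itertools import product

def path_law(mu, kernels):
    """Return {(x_0,...,x_n): mu(x_0) M_1(x_0,x_1) ... M_n(x_{n-1},x_n)}."""
    n_states = len(mu)
    law = {}
    for path in product(range(n_states), repeat=len(kernels) + 1):
        weight = mu[path[0]]
        for step, kernel in enumerate(kernels):
            # kernels[step] plays the role of M_{step+1}
            weight *= kernel[path[step]][path[step + 1]]
        law[path] = weight
    return law

mu = [0.5, 0.5]
M1 = [[0.9, 0.1], [0.2, 0.8]]
M2 = [[0.3, 0.7], [0.6, 0.4]]
law = path_law(mu, [M1, M2])

# Summing out the last coordinate recovers the law of (X_0, X_1), which is
# the consistency property of the family of path distributions.
marginal_01 = law[(0, 1, 0)] + law[(0, 1, 1)]
```

The enumeration has cost exponential in the path length, so it is only meant as a check on small examples.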

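By the consistency of the collection $\mathbb{P}_{\mu,n}$, $n \in \mathbb{N}$, there
exists an overall distribution $\mathbb{P}_\mu$ on $\Omega = \prod_{n \ge 0} E_n$ for
which the $\mathbb{P}_{\mu,n}$ are the finite-dimensional distributions (see
Ionescu Tulcea’s theorem, p. 249 in [287]). That is, for any
$A_n \in \mathcal{E}_n$, $n \ge 0$, and any cylinder set of the form

$$C_n(A_0, \ldots, A_n) = \{\omega = (\omega_n)_{n \ge 0} \in \Omega \;;\; \forall\, 0 \le p \le n \quad \omega_p \in A_p\}$$

we have that

$$\mathbb{P}_\mu(C_n(A_0, \ldots, A_n)) = \mathbb{P}_{\mu,n}(A_0 \times \cdots \times A_n)$$

If we denote by $X_n$, $n \in \mathbb{N}$, the sequence of canonical mappings

$$X_n : \omega = (\omega_n)_{n \ge 0} \in \Omega \longrightarrow X_n(\omega) = \omega_n \in E_n$$

then we find that

$$C_n(A_0, \ldots, A_n) = \{\omega = (\omega_n)_{n \ge 0} \in \Omega \;;\; \forall\, 0 \le p \le n \quad X_p(\omega) \in A_p\}$$

This readily yields that

$$\mathbb{P}_\mu((X_0, \ldots, X_n) \in (A_0 \times \cdots \times A_n)) = \int_{A_0 \times \cdots \times A_n} \mu(dx_0)\, M_1(x_0, dx_1) \cdots M_n(x_{n-1}, dx_n)$$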







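The canonical probability space defined in this way

$$\Big(\Omega = \prod_{n \ge 0} E_n,\ \mathcal{F} = (\mathcal{F}_n)_{n \in \mathbb{N}},\ X = (X_n)_{n \in \mathbb{N}},\ \mathbb{P}_\mu\Big)$$

is called the canonical Markov chain with transitions $M_n$ and initial
distribution $\mu$. The probability measure $\mathbb{P}_\mu$ on the canonical space
$(\Omega, \mathcal{F}, \mathcal{F}_\infty, X)$, with
$\mathcal{F}_\infty = \bigvee_{n \ge 0} \mathcal{F}_n$, is called the distribution or
the law of the (canonical) Markov chain.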



2.2.2 Path-Space Markov Models



The abstract mathematical modeling presented in the preceding section is
particularly useful to describe Markov motions on path spaces. We now
discuss some of these models. Let $(E'_n, \mathcal{E}'_n)$ be an auxiliary
collection of measurable spaces, and let $X'_n$ be a nonanticipative sequence
of $E'_n$-valued random variables in the sense that the distribution of
$X'_{n+1}$ on $E'_{n+1}$ only depends on the random states $(X'_0, \ldots, X'_n)$.
By direct inspection, we notice that under some appropriate measurability
conditions the path sequence

$$X_n = X'_{[0,n]} = (X'_0, \ldots, X'_n) \tag{2.2}$$

forms a nonhomogeneous Markov chain taking values in the product spaces

$$E_n = E'_{[0,n]} = (E'_0 \times \cdots \times E'_n)$$

In this situation, each point $x_n = (x'_0, \ldots, x'_n) \in E_n$ has to be
thought of as a path from the origin up to time $n$.

The archetype of such Markov path models is the situation where $X'_n$ is
an $E'_n$-valued Markov chain with not necessarily homogeneous transitions
$M'_n(x_{n-1}, dx_n)$ from $E'_{n-1}$ into $E'_n$. In this context, the Markov
transitions $M_n$ of the path chain (2.2) and $M'_n$ are connected by the formula

$$M_{n+1}((x_0, \ldots, x_n), d(y_0, \ldots, y_n, y_{n+1})) = \delta_{(x_0, \ldots, x_n)}(d(y_0, \ldots, y_n))\; M'_{n+1}(y_n, dy_{n+1}) \tag{2.3}$$



Of course, the time parameter can be added to the state space as an
additional deterministic variable. As an aside, if we consider the sequence
$(n, X_n)$ on the state space $E = \bigcup_{p \ge 0} (\{p\} \times E_p)$, we do get
a time-homogeneous Markov chain with elementary transitions

$$M'((n, x), d(p, y)) = \delta_{n+1}(p)\; M_p(x, dy) \tag{2.4}$$

Nevertheless, the Markov kernels (2.4) contain a Dirac measure, and various
regularity conditions needed later are not preserved by this state-space
enlargement. On the other hand, the time parameter here is often interpreted
as the length of a path. It seems therefore notationally more transparent
to consider nonhomogeneous path spaces.







Definition 2.2.1 The Markov chain in path space

$$X_n = X'_{[0,n]} \in E_n = E'_{[0,n]}$$

associated with an $E'_n$-valued Markov chain $X'_n$ is called the historical
process or the path process of the chain $X'_n$.

The motion of the historical process $X_n$ simply consists of extending each
path of $X'_n$ with an elementary $M'_n$-transition. We have the synthetic
diagram

$$X_{n-1} = X'_{[0,n-1]} \;\xrightarrow{\;M'_n\;}\; X_n = (X'_{[0,n-1]}, X'_n)$$

2.2.3 Stopped Markov chains 

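We consider a Markov chain $X'_n$ taking values in some measurable spaces
$(E'_n, \mathcal{E}'_n)$ with elementary transitions $M'_n$ from $E'_{n-1}$ into
$E'_n$. We further assume that $X'$ is defined on the canonical space

$$\Big(\Omega = \prod_{n \ge 0} E'_n,\ \mathcal{F} = (\mathcal{F}_n)_{n \in \mathbb{N}},\ X' = (X'_n)_{n \in \mathbb{N}},\ (\mathbb{P}_x)_{x \in E'_0}\Big)$$

where $\mathcal{F}_n$ is the natural filtration generated by the random
variables $X'_p$ with $p \le n$.

A finite stopping time $T$ with respect to $\mathcal{F}$ is a random variable
taking values in $\mathbb{N}$ and such that for any $x \in E'_0$ we have

$$\{T = n\} \in \mathcal{F}_n \quad\text{and}\quad \mathbb{P}_x(T < \infty) = 1$$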

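As an aside, let us check that the chain $X'_n$ always satisfies the strong
Markov property with respect to $T$. We recall that the $\sigma$-field
associated to $T$ is given by

$$\mathcal{F}_T = \{A \in \mathcal{F}_\infty \;:\; A \cap \{T \le n\} \in \mathcal{F}_n,\ \forall n \ge 0\}$$

For any $A \in \mathcal{F}_T$, any collection of subsets $B_n \in \mathcal{E}'_n$,
and any $p \ge 0$ we have

$$\begin{aligned}
&\mathbb{P}_x\big(A \cap (X'_{T+1} \in B_{T+1}, \ldots, X'_{T+p} \in B_{T+p})\big) \\
&\quad = \sum_{n \ge 0} \mathbb{E}_x\Big(1_{A \cap \{T = n\}} \int_{B_{n+1} \times \cdots \times B_{n+p}} M'_{n+1}(X'_n, dx_1) \cdots M'_{n+p}(x_{p-1}, dx_p)\Big) \\
&\quad = \mathbb{E}_x\Big(1_A \int_{B_{T+1} \times \cdots \times B_{T+p}} M'_{T+1}(X'_T, dx_1) \cdots M'_{T+p}(x_{p-1}, dx_p)\Big)
\end{aligned}$$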







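We conclude the strong Markov property

$$\mathbb{P}_x\big(X'_{T+1} \in B_{T+1}, \ldots, X'_{T+p} \in B_{T+p} \mid \mathcal{F}_T\big) = \int_{B_{T+1} \times \cdots \times B_{T+p}} M'_{T+1}(X'_T, dx_1) \cdots M'_{T+p}(x_{p-1}, dx_p)$$

We let $X_n$ be the stopped path-valued process defined by

$$X_n = (n \wedge T, X'_{[0, n \wedge T]}) \in E = \bigcup_{p \ge 0} (\{p\} \times E'_{[0,p]})$$

with

$$X'_{[0, n \wedge T]} = (X'_0, \ldots, X'_{n \wedge T}) = \sum_{p=0}^{n-1} X'_{[0,p]}\, 1_{T = p} + X'_{[0,n]}\, 1_{T \ge n}$$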

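To describe precisely its elementary transitions we need a few observations.
Since we have

$$\{T \ge n\} = \{\omega = (x_n)_{n \ge 0} \;:\; T(\omega) \ge n\} \in \mathcal{F}_{n-1} = \sigma(X'_p,\ p < n)$$

there exists a measurable set $A_n \subset E'_{[0,n-1]}$ such that

$$\{T \ge n\} = \{X'_{[0,n-1]} \in A_n\} \tag{2.5}$$

Note that

$$\{T \ge n\} = \{T \wedge n = n\} \quad\text{and}\quad \{T < n+1\} = \{T \le n\} = \{X'_{[0,n]} \in A^c_{n+1}\}$$

and therefore

$$\{T \wedge n = n\} \cap \{X'_{[0,n]} \in A^c_{n+1}\} = \{T = n\}$$

Using this set-realization of the stopping time we find the decomposition

$$X_{n+1} = X_n \big(1_{T \wedge n < n} + 1_{T \wedge n = n}\, 1_{A^c_{n+1}}(X'_{[0,n]})\big) + 1_{T \wedge n = n}\, 1_{A_{n+1}}(X'_{[0,n]})\; (n+1, X'_{[0,n+1]})$$

The elementary transitions $M_n$ of $X_n$ are now given for any $p \le n$ and
$(x_0, \ldots, x_p) \in E'_{[0,p]}$ by the formula

$$\begin{aligned}
&M_{n+1}\big((p, (x_0, \ldots, x_p)), d(q, (y_0, \ldots, y_q))\big) \\
&\quad = \big(1_{p < n} + 1_{p = n}\, 1_{A^c_{n+1}}(x_0, \ldots, x_n)\big)\; \delta_{(p, (x_0, \ldots, x_p))}\big(d(q, (y_0, \ldots, y_q))\big) \\
&\qquad + 1_{p = n}\, 1_{A_{n+1}}(x_0, \ldots, x_n)\; \delta_{(n+1, (x_0, \ldots, x_n))}\big(d(q, (y_0, \ldots, y_n))\big)\; M'_{n+1}(y_n, dy_{n+1})
\end{aligned}$$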







The archetype of such stopped processes is the situation where $T$
represents the exit time of a time-homogeneous Markov chain $X'$ from a given
measurable set $A \subset E'$. In this case we have

$$\{T \ge n\} = \{X'_{[0,n-1]} \in A^n\}$$

Note that, in this situation, the stopped process $X'_{n \wedge T}$ is itself a
Markov chain with elementary transitions

$$K(x, dy) = 1_{A^c}(x)\, \delta_x(dy) + 1_A(x)\, M'(x, dy)$$

We finally mention that the Markov chain $X_n$ introduced above converges
almost surely to the excursion valued random variable $X_T$ as $n$ tends to
infinity. Indeed, if we set $\Omega_X = \{\lim_{n \to \infty} X_n = X_T\}$, then we
have $\mathbb{P}_x(\Omega_X \cap \{T \le n\}) = \mathbb{P}_x(T \le n)$ for every
$n \ge 0$. By the monotone convergence theorem, we conclude that
$\mathbb{P}_x(\Omega_X) = 1$. Also note that, for any
$f \in \mathcal{B}_b\big(\bigcup_{n \ge 0} (\{n\} \times E'_{[0,n]})\big)$,

$$|\mathbb{E}(f(X_n)) - \mathbb{E}(f(X_T))| = |\mathbb{E}((f(X_n) - f(X_T))\, 1_{T > n})| \le 2 \|f\|\, \mathbb{P}(T > n) \longrightarrow 0$$

If $T$ is not almost surely finite, we observe that the above analysis remains
valid on the event $\{T < \infty\}$. In this case, we have

$$|\mathbb{E}(f(X_n)\, 1_{T < \infty}) - \mathbb{E}(f(X_T)\, 1_{T < \infty})| \le 2 \|f\|\, \mathbb{P}(n < T < \infty) \longrightarrow 0$$

Random excursion and stopped processes appear to be useful in the
Feynman-Kac modeling of Dirichlet boundary problems (see Section 12.2.2 and
Section 12.2.4). To illustrate this observation we return to the
time-homogeneous model stopped when it exits the set $A$ described above. We
also consider the bounded linear mapping

$$D : \varphi \in \mathcal{B}_b(E') \longrightarrow D(\varphi) \in \mathcal{B}_b(E')$$

defined by

$$D(\varphi)(x) = \mathbb{E}_x(\varphi(X'_T))$$

It is easy to check that $D(\varphi)$ satisfies the Dirichlet problem

$$\begin{cases} M'(D(\varphi))(x) = D(\varphi)(x) & \text{for } x \in A \\ D(\varphi)(x) = \varphi(x) & \text{for } x \in A^c \end{cases}$$
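
The Dirichlet problem can be checked numerically on a small example. The sketch below assumes a five-point state space $\{0, \ldots, 4\}$, a symmetric nearest-neighbor walk for $M'$, and the set $A = \{1, 2, 3\}$; all of these are illustrative choices. It approximates $D(\varphi)$ by iterating the kernel $K$ of the stopped chain, using that $K^n(\varphi)(x) = \mathbb{E}_x(\varphi(X'_{n \wedge T}))$.

```python
def apply_kernel(kernel, f):
    """Return the function x -> sum_y kernel(x, y) f(y) as a list."""
    return [sum(k * v for k, v in zip(row, f)) for row in kernel]

states = range(5)
A = {1, 2, 3}                      # the chain is stopped when it exits A

# M': symmetric nearest-neighbor walk, held inside {0,...,4} at the ends.
M = [[0.0] * 5 for _ in states]
for x in states:
    M[x][max(x - 1, 0)] += 0.5
    M[x][min(x + 1, 4)] += 0.5

# K(x, dy) = 1_{A^c}(x) delta_x(dy) + 1_A(x) M'(x, dy)
K = [M[x] if x in A else [1.0 if y == x else 0.0 for y in states]
     for x in states]

phi = [0.0, 0.0, 0.0, 0.0, 1.0]    # boundary data: indicator of state 4
D = phi
for _ in range(500):               # K^n(phi) converges to D(phi)
    D = apply_kernel(K, D)
MD = apply_kernel(M, D)            # M'(D(phi)), to compare with D(phi) on A
```

In this example $D(\varphi)(x)$ is the probability of leaving $A$ through the state 4, which for the symmetric walk equals $x/4$.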

In much the same way, if we choose the function

$$f(T, X'_{[0,T]}) = \varphi(X'_T) + \sum_{0 \le p < T} g(X'_p)$$

for some $g \in \mathcal{B}_b(E')$, then we readily check that the function

$$h(x) = \mathbb{E}_x\Big(\varphi(X'_T) + \sum_{0 \le p < T} g(X'_p)\Big)$$

satisfies the (generalized) Poisson problem

$$\begin{cases} h(x) = M'(h)(x) + g(x) & \text{for } x \in A \\ h(x) = \varphi(x) & \text{for } x \in A^c \end{cases}$$

In this situation we have

$$|\mathbb{E}_x(f(X_n)) - \mathbb{E}_x(f(X_T))| \le (\mathrm{osc}(\varphi) + \|g\|)\; \mathbb{E}_x(T\, 1_{T > n}) \longrightarrow 0$$

for any $n \ge 1$, and as soon as $\mathbb{E}_x(T) < \infty$.

2.2.4 Examples

Example 2.2.1 (Simple random walks) A well-known example of a Markov
chain is the simple random walk on the lattice $E = \mathbb{Z}$ defined by

$$X_n = X_0 + \sum_{i=1}^{n} \epsilon_i \tag{2.6}$$

where $(\epsilon_i)_{i \ge 1}$ is a sequence of independent and identically
distributed random variables with common law $\mathbb{P}(\epsilon_1 = +1) = p$
and $\mathbb{P}(\epsilon_1 = -1) = q$, with $p, q \in (0, 1)$ and $p + q = 1$. The
above Markov chain is sometimes used to model a gambler’s random ruin process
starting with $X_0$ euros and winning or losing 1 euro with probability $p$
or $q$, respectively. More generally, a simple random walk on the
$d$-dimensional lattice $E = \mathbb{Z}^d$ is the Markov chain with elementary
transitions $M(x, .) = \sum_{|e|=1} p(e)\, \delta_{x+e}$ for some collection of
numbers $p(e) \in [0, 1]$, $|e| = 1$, with $\sum_{|e|=1} p(e) = 1$.

Example 2.2.2 (Birth and death model) The birth and death model
is a simple Markov chain $X_n$ on $E = \mathbb{N}$ where the state 0 is an
absorbing barrier. Its elementary transitions are given for any $x > 0$ by

$$M(x, dy) = p(x)\, \delta_{x+1}(dy) + q(x)\, \delta_{x-1}(dy)$$

where for any $x > 0$ we have $p(x), q(x) \in [0, 1]$ with $p(x) + q(x) = 1$ and
the absorbing condition is given by $M(0, \{0\}) = 1$.

Example 2.2.3 (Storage and dam model) Various storage processes
arising in engineering and financial sciences have a Markovian
representation. A typical toy example is the water flow reservoir. In this
context, $X_n$ represents the storage level and the amount of water in the
dam. The inflows, demands, and the percentage loss due to evaporation are
represented by nonnegative random variables $I_n$, $D_n$ and $\epsilon_n$.
Assuming that $(I_n, D_n, \epsilon_n)$ are independent variables, the Markov
chain $X_n$ evolution is given by the recursion

$$X_n = \big((1 - \epsilon_n)\, X_{n-1} + (I_n - D_n)\big)^+$$







Example 2.2.4 (Auto-regressive models) Consider the $\mathbb{R}$-valued and
random recurrence relation

$$X'_n = \sum_{k=1}^{p} a(k)\, X'_{n-k} + W_n$$

where $a = [a(1), \ldots, a(p)] \in \mathbb{R}^p$ is a deterministic vector, and
$W_n$ is a collection of independent and real-valued random variables. If we
consider the $p$-length vector

$$X_n = \big[X'_n, X'_{n-1}, \ldots, X'_{n-p+1}\big]'$$

then we find that

$$X'_n = a\, X_{n-1} + W_n$$

from which we easily conclude that $X_n$ is a $p$-dimensional Markov chain.
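
The passage from the scalar recurrence to the vector-valued chain can be made concrete as follows. This is a sketch only: the function names, the coefficient values, and the fixed noise sequence are assumptions made for illustration, and the state is stored most recent value first.

```python
def ar_scalar(a, start, noise):
    """Scalar recurrence X'_n = sum_k a(k) X'_{n-k} + W_n.

    `start` lists X'_0, ..., X'_{p-1} in chronological order."""
    xs = list(start)
    p = len(a)
    for w in noise:
        xs.append(sum(a[k] * xs[-1 - k] for k in range(p)) + w)
    return xs

def ar_vector_step(a, state, w):
    """One Markov transition of the vector chain.

    `state` is (X'_{n-1}, ..., X'_{n-p}); the result is (X'_n, ..., X'_{n-p+1})."""
    new_value = sum(a_k * x for a_k, x in zip(a, state)) + w
    return (new_value,) + tuple(state[:-1])

a = [0.5, -0.25]                   # hypothetical coefficients a(1), a(2)
start = [1.0, 2.0]                 # X'_0, X'_1
noise = [0.1, -0.3, 0.2]           # a fixed realization of the noise

scalar_path = ar_scalar(a, start, noise)

state = (2.0, 1.0)                 # (X'_1, X'_0)
vector_path = []
for w in noise:
    state = ar_vector_step(a, state, w)
    vector_path.append(state[0])   # first coordinate is the newest value
```

The next state depends on the past only through the current $p$-vector, which is exactly the Markov property claimed in the example.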



Example 2.2.5 (Queueing model) Consider a single-line service queue
in which one customer is served per unit of time. The random number of
new arrivals at time $n$ is specified by a distribution $\mu_n$ on
$\mathbb{N}$. Assuming that these arrival numbers are independent, the Markov
transitions of the queue length $X_n$ at time $n$ are given for any $j \ge 0$
and $i \ge 1$ by the formulae

$$\mathbb{P}(X_{n+1} = (i - 1) + j \mid X_n = i) = \mu_{n+1}(j) = \mathbb{P}(X_{n+1} = j \mid X_n = 0)$$

Example 2.2.6 (Urn model) We consider an urn with black and white
balls. At each time a ball is randomly chosen, and returned to the urn with
an extra ball of the same color. Let $X_n = (B_n, W_n)$ be the random numbers
of black and white balls. It is easily checked that $X_n$ is a Markov chain
taking values in $\mathbb{N}^2$, and its transitions are given for any
$b + w \ge 1$ by

$$M((b, w), .) = \frac{b}{b + w}\; \delta_{(b+1, w)} + \frac{w}{b + w}\; \delta_{(b, w+1)}$$

Example 2.2.7 (Branching model) Consider an elementary population
branching model in which each $i$th individual member produces at time $n$
a random number of offspring $g^i_n$. We assume that $(g^i_n)_{i \ge 1, n \ge 0}$
are independent and identically distributed random variables taking values
in $\mathbb{N}$, and we let $\mathbb{E}(g^i_n) = m$ be the expected number of
offspring. The number of individuals at time $n$ is a Markov chain starting
with a single individual $X_0 = 1$, taking values in $\mathbb{N}$, and it has
the representation $X_{n+1} = \sum_{i=1}^{X_n} g^i_n$.

Note that if $m < 1$, then we have

$$\mathbb{E}\Big(\sum_{n \ge 0} X_n\Big) = 1/(1 - m) \quad\text{and}\quad \mathbb{P}\Big(\sum_{n \ge 0} X_n < \infty\Big) = 1$$

In this case, $X_n$ tends to 0 almost surely as $n$ tends to infinity. When
$m = 1$, $X_n$ is a martingale that converges almost surely to the only
possible value, namely 0 ($\infty$ being clearly excluded). If $m > 1$,
recalling that
$\mathbb{P}(\lim_{p \to \infty} X_p = 0 \mid X_1 = j) = \mathbb{P}(\lim_{p \to \infty} X_p = 0 \mid X_0 = 1)^j$,
the sequence of random variables

$$M_n = \mathbb{P}\big(\lim_{p \to \infty} X_p = 0\big)^{X_n}$$

is a martingale, and we have

$$\mathbb{P}\big(\lim_{p \to \infty} M_p = 1\big) = \mathbb{P}\big(\lim_{p \to \infty} X_p = 0\big), \qquad \mathbb{P}\big(\lim_{p \to \infty} M_p = 0\big) = \mathbb{P}\big(\lim_{p \to \infty} X_p = \infty\big)$$

Example 2.2.8 (Independent path sequences) Let $U_n$, $n \in \mathbb{N}$, be a
collection of independent random variables taking values in some measurable
state spaces $(C_n, \mathcal{C}_n)$. The sequence of random variables defined by
$X_n = (U_0, \ldots, U_n) \in E_n = (C_0 \times \cdots \times C_n)$ forms a Markov
chain.

Example 2.2.9 (Excursion valued Markov chains) Let $Y_n$ be a Markov
chain taking values at each time $n \in \mathbb{N}$ on some measurable space
$(S_n, \mathcal{S}_n)$. Also let $T_n$, $n \in \mathbb{N}$, be a collection of
nondecreasing stopping times (with respect to the filtration
$F_n = \sigma(Y_0, \ldots, Y_n)$ associated with $Y_n$) such that the pair
sequence $(T_n, Y_{T_n})$ is a Markov chain on
$E = \bigcup_{n \ge 0} (\{n\} \times S_n)$. For any $0 \le p \le n$ we write

$$Y_{[p,n]} = (Y_q)_{p \le q \le n} \in S_{[p,n]} = (S_p \times \cdots \times S_n)$$

the excursion of $Y_q$ from time $p$ up to time $n$. We can check that the
following sequences are Markov chains:

$$(T_n, Y_{[T_{n-1}, T_n]}), \qquad (T_n - T_{n-1}, Y_{[T_{n-1}, T_n]}) \qquad\text{and}\qquad (T_n - T_{n-1}, Y_{T_n})$$

Let p > 1 be a fixed integer parameter. We easily check that the random 
sequence 

is a Markov chain. For any nondecreasing sequence of integers tn, n 6 N, 
the random sequence Xn = also forms a Markov 

chain. 



Example 2.2.10 (Restricted Markov chains) Let $Y_n$ be a time-homogeneous
Markov chain on some measurable space $(E, \mathcal{E})$. We further assume
that $A \in \mathcal{E}$ is a recurrent set (in the sense that $Y_n$ visits $A$
infinitely often) and $Y_0 \in A$. We define the nondecreasing sequence of
return times to $A$

$$T_n = \inf\{p > T_{n-1} \;:\; Y_p \in A\} \quad\text{with}\quad T_0 = 0$$

The sequence $X_n = Y_{T_n}$ forms a time-homogeneous Markov chain taking
values in $A$ and its elementary transition is defined by

$$M(x, dy) = \mathbb{P}_x(Y_{T_1} \in dy)$$





2.3 Description of the Models 

From the discussion given in the introduction of the book, we see that the 
Feynman-Kac path measures can be sought in many different ways. If we 
want to capture the full force of these models, it is therefore necessary to 
undertake their analysis in an abstract and nonhomogeneous setting. In 
this section, we introduce an abstract class of Feynman-Kac models in gen- 
eral nonhomogeneous state spaces. These models are built with two main 
ingredients: a Markov chain associated with a reference probability mea- 
sure and a sequence of potential functions related to the mass repartition 
of the Feynman-Kac measures. Our first task is to introduce these two 
mathematical objects. 

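Let $(E_n, \mathcal{E}_n)$, $n \in \mathbb{N}$, be a collection of measurable
spaces. We consider a collection of Markov transitions $M_n(x_{n-1}, dx_n)$
from $E_{n-1}$ into $E_n$ and a given probability measure
$\mu \in \mathcal{P}(E_0)$. We associate with the latter a nonhomogeneous Markov
chain

$$\Big(\Omega = \prod_{n \ge 0} E_n,\ \mathcal{F} = (\mathcal{F}_n)_{n \in \mathbb{N}},\ X = (X_n)_{n \in \mathbb{N}},\ \mathbb{P}_\mu\Big)$$

taking values at each time $n$ on $E_n$ with elementary transitions $M_n$ and
initial distribution $\mu$. When the initial distribution $\mu = \delta_x$ is
concentrated at a single point $x \in E_0$, we simplify notation and we write
$\mathbb{P}_x$ instead of $\mathbb{P}_{\delta_x}$. We use the notation
$\mathbb{E}_\mu(.)$ and $\mathbb{E}_x(.)$ for the expectations with respect to
$\mathbb{P}_\mu$ and $\mathbb{P}_x$. In this simplified notation, we notice that

$$\mathbb{P}_\mu(.) = \int_{E_0} \mu(dx)\; \mathbb{P}_x(.)$$

and for any $F_n \in \mathcal{B}_b(E_{[0,n]})$ we have

$$\mathbb{E}_\mu(F_n(X_0, \ldots, X_n)) = \int_{E_{[0,n]}} F_n(x_0, \ldots, x_n)\; \mathbb{P}_{\mu,n}(d(x_0, \ldots, x_n))$$

with the distribution $\mathbb{P}_{\mu,n}$ on $E_{[0,n]}$ given by

$$\mathbb{P}_{\mu,n}(d(x_0, \ldots, x_n)) = \mu(dx_0)\, M_1(x_0, dx_1) \cdots M_n(x_{n-1}, dx_n)$$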

Let $G_n : E_n \to [0, \infty)$ be a given collection of bounded and
$\mathcal{E}_n$-measurable nonnegative functions such that for any
$n \in \mathbb{N}$

$$\mathbb{E}_\mu\Big(\prod_{p=0}^{n} G_p(X_p)\Big) > 0$$

Next we present the definitions of the Feynman-Kac models associated with
the pair potential/kernel $(G_n, M_n)$. We start with the traditional
description of a path-space model. In reference to filtering literature, we
adopt the following terminology.







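Definition 2.3.1 The Feynman-Kac prediction and updated path models
associated with the pair $(G_n, M_n)$ (and the initial distribution $\mu$) are
the sequence of path measures defined respectively by

$$\begin{aligned}
\mathbb{Q}_{\mu,n}(d(x_0, \ldots, x_n)) &= \frac{1}{\mathcal{Z}_n} \Big\{\prod_{p=0}^{n-1} G_p(x_p)\Big\}\; \mathbb{P}_{\mu,n}(d(x_0, \ldots, x_n)) \\
\widehat{\mathbb{Q}}_{\mu,n}(d(x_0, \ldots, x_n)) &= \frac{1}{\widehat{\mathcal{Z}}_n} \Big\{\prod_{p=0}^{n} G_p(x_p)\Big\}\; \mathbb{P}_{\mu,n}(d(x_0, \ldots, x_n))
\end{aligned} \tag{2.7}$$

for any $n \in \mathbb{N}$. The normalizing constants

$$\mathcal{Z}_n = \mathbb{E}_\mu\Big(\prod_{p=0}^{n-1} G_p(X_p)\Big) \quad\text{and}\quad \widehat{\mathcal{Z}}_n = \mathbb{E}_\mu\Big(\prod_{p=0}^{n} G_p(X_p)\Big)$$

are also often called the partition functions.

The measures $\mathbb{Q}_{\mu,n}$ and $\widehat{\mathbb{Q}}_{\mu,n}$ are alternatively
defined for any test function $F_n \in \mathcal{B}_b(E_{[0,n]})$ by the formulae

$$\begin{aligned}
\mathbb{Q}_{\mu,n}(F_n) &= \frac{1}{\mathcal{Z}_n}\; \mathbb{E}_\mu\Big(F_n(X_0, \ldots, X_n) \prod_{p=0}^{n-1} G_p(X_p)\Big) \\
\widehat{\mathbb{Q}}_{\mu,n}(F_n) &= \frac{1}{\widehat{\mathcal{Z}}_n}\; \mathbb{E}_\mu\Big(F_n(X_0, \ldots, X_n) \prod_{p=0}^{n} G_p(X_p)\Big)
\end{aligned}$$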

This “weak” description of the measures in terms of the expectation
$\mathbb{E}_\mu(.)$ with respect to the law of a reference Markov chain is more
tractable than the previous one. Definition 2.3.1 shows the correspondence
between Boltzmann-Gibbs and Feynman-Kac models. The difference between these
two models concerns the role of the time parameter. In contrast to
Boltzmann-Gibbs measures, the Feynman-Kac models have a particular dynamic
structure. To get one step further in this discussion, it is convenient to
introduce the flow of the time marginals.

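Definition 2.3.2 The sequence of bounded nonnegative measures $\gamma_n$ and
$\widehat{\gamma}_n$ on $E_n$ defined for any $f_n \in \mathcal{B}_b(E_n)$ by

$$\gamma_n(f_n) = \mathbb{E}_\mu\Big(f_n(X_n) \prod_{p=0}^{n-1} G_p(X_p)\Big) \quad\text{and}\quad \widehat{\gamma}_n(f_n) = \mathbb{E}_\mu\Big(f_n(X_n) \prod_{p=0}^{n} G_p(X_p)\Big)$$

are respectively called the unnormalized prediction and the updated
Feynman-Kac model associated with the pair $(G_n, M_n)$. The sequence of
distributions $\eta_n$ and $\widehat{\eta}_n$ on $E_n$ defined for any
$f_n \in \mathcal{B}_b(E_n)$ as

$$\eta_n(f_n) = \gamma_n(f_n)/\gamma_n(1) \quad\text{and}\quad \widehat{\eta}_n(f_n) = \widehat{\gamma}_n(f_n)/\widehat{\gamma}_n(1) \tag{2.8}$$

are respectively called the normalized prediction and updated Feynman-Kac
model associated with the pair $(G_n, M_n)$.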
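
On a finite state space, the unnormalized prediction measure can be computed straight from its definition by summing over all paths. The sketch below assumes two states, a time-homogeneous kernel, and a potential stored as a list of values; the function name `unnormalized_prediction` and the numbers are illustrative only.

```python
from itertools import product

def unnormalized_prediction(mu, kernels, potentials, f):
    """gamma_n(f) = E_mu[ f(X_n) prod_{p<n} G_p(X_p) ] by path enumeration.

    kernels = [M_1, ..., M_n] and potentials = [G_0, ..., G_{n-1}],
    each potential being a list indexed by the state."""
    n = len(kernels)
    total = 0.0
    for path in product(range(len(mu)), repeat=n + 1):
        weight = mu[path[0]]
        for p in range(n):
            weight *= potentials[p][path[p]] * kernels[p][path[p]][path[p + 1]]
        total += weight * f[path[n]]
    return total

mu = [0.5, 0.5]
M = [[0.9, 0.1], [0.2, 0.8]]
G = [1.0, 0.5]

# gamma_2(1) and the normalized prediction measure eta_2 = gamma_2 / gamma_2(1)
gamma_mass = unnormalized_prediction(mu, [M, M], [G, G], [1.0, 1.0])
eta_2 = [
    unnormalized_prediction(mu, [M, M], [G, G], indicator) / gamma_mass
    for indicator in ([1.0, 0.0], [0.0, 1.0])
]
```

Replacing the product over $p < n$ by a product over $p \le n$ in the loop would give the updated measure $\widehat{\gamma}_n$ instead.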

There exist several ways to extend these formulae to a more general
potential. The interested reader is referred to the book by A.S. Sznitman
[295], Section X.11 in M. Reed and B. Simon [277], or the book by
M. Nagasawa [254].

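To better connect these objects, it is convenient to make a couple of
remarks. First we observe that for $n = 0$ we have
$\eta_0 = \gamma_0 = \mu \in \mathcal{P}(E_0)$. On the other hand, we have for any
$n \in \mathbb{N}$

$$\widehat{\mathcal{Z}}_n = \mathcal{Z}_{n+1} = \widehat{\gamma}_n(1) = \gamma_{n+1}(1) = \mathbb{E}_\mu\Big(\prod_{p=0}^{n} G_p(X_p)\Big)$$

In this connection, we also observe that
$\widehat{\gamma}_n(1) \le \|G_n\|\; \widehat{\gamma}_{n-1}(1)$, from which we
conclude that

$$\widehat{\mathcal{Z}}_n = \widehat{\gamma}_n(1) > 0 \;\Longrightarrow\; \forall\, 0 \le p \le n \quad \widehat{\gamma}_p(1) > 0$$

We end this section with an important formula that relates the
“unnormalized” models $\gamma_n$ and $\widehat{\gamma}_n$ with the Feynman-Kac
distribution flow $\eta_p$, $p \le n$. We start by noting that

$$\gamma_n(f_n G_n) = \mathbb{E}_\mu\Big(f_n(X_n)\, G_n(X_n) \prod_{p=0}^{n-1} G_p(X_p)\Big) = \widehat{\gamma}_n(f_n)$$

By direct inspection, this yields

$$\widehat{\eta}_n(f_n) = \frac{\gamma_n(f_n G_n)}{\gamma_n(G_n)} = \frac{\gamma_n(f_n G_n)/\gamma_n(1)}{\gamma_n(G_n)/\gamma_n(1)} = \frac{\eta_n(f_n G_n)}{\eta_n(G_n)}$$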

Thus, we are led to introduce the following transformation. 

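Definition 2.3.3 The Boltzmann-Gibbs transformation associated with a
potential function $G_n$ on $(E_n, \mathcal{E}_n)$ is the mapping

$$\Psi_n : \eta \in \mathcal{P}_n(E_n) \longrightarrow \Psi_n(\eta) \in \mathcal{P}_n(E_n)$$

from the subset $\mathcal{P}_n(E_n) = \{\eta \in \mathcal{P}(E_n) \;;\; \eta(G_n) > 0\}$
into itself and defined for any $\eta \in \mathcal{P}_n(E_n)$ by the
Boltzmann-Gibbs measure

$$\Psi_n(\eta)(dx_n) = \frac{1}{\eta(G_n)}\; G_n(x_n)\; \eta(dx_n)$$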
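
For a distribution on a finite set, the Boltzmann-Gibbs transformation is a reweighting followed by a normalization. The following minimal sketch assumes that measures and potentials are stored as lists indexed by the state; the function name `boltzmann_gibbs` and the numerical values are illustrative.

```python
def boltzmann_gibbs(eta, G):
    """Return Psi(eta)(x) = G(x) eta(x) / eta(G) on a finite state space."""
    mass = sum(e * g for e, g in zip(eta, G))      # eta(G)
    if mass <= 0.0:
        raise ValueError("eta(G) must be strictly positive")
    return [e * g / mass for e, g in zip(eta, G)]

eta = [0.5, 0.3, 0.2]
G = [1.0, 0.0, 2.0]                 # the potential vanishes at state 1
psi = boltzmann_gibbs(eta, G)       # mass moves toward high-potential states
```

The explicit check on `mass` mirrors the restriction to measures with $\eta(G_n) > 0$ in the definition.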







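Note that the Boltzmann-Gibbs transformation is well-defined as soon as
$G_n$ is not the null function. In this notation, we see that

$$\widehat{\eta}_n = \Psi_n(\eta_n) \tag{2.9}$$

In the reverse angle, the distribution flow $\eta_n$ is connected to
$\widehat{\eta}_{n-1}$ by the formula

$$\eta_n = \widehat{\eta}_{n-1} M_n \tag{2.10}$$

To see this claim we simply use the Markov property of $X_n$ in the definition
of $\gamma_n$ to check that

$$\begin{aligned}
\gamma_n(f_n) &= \mathbb{E}_\mu\Big(f_n(X_n) \prod_{p=0}^{n-1} G_p(X_p)\Big) \\
&= \mathbb{E}_\mu\Big(M_n(f_n)(X_{n-1}) \prod_{p=0}^{n-1} G_p(X_p)\Big) = \widehat{\gamma}_{n-1}(M_n(f_n))
\end{aligned}$$

From this we find that

$$\eta_n(f_n) = \gamma_n(f_n)/\gamma_n(1) = \widehat{\gamma}_{n-1}(M_n(f_n))/\widehat{\gamma}_{n-1}(1) = \widehat{\eta}_{n-1} M_n(f_n)$$

We are now in a position to state the announced formula, whose proof is
left as a simple exercise.

Proposition 2.3.1 For any $n \in \mathbb{N}$ and for any
$f_n \in \mathcal{B}_b(E_n)$, we have

$$\gamma_n(f_n) = \eta_n(f_n) \prod_{p=0}^{n-1} \eta_p(G_p) \quad\text{and}\quad \widehat{\gamma}_n(f_n) = \widehat{\eta}_n(f_n) \prod_{p=0}^{n} \eta_p(G_p)$$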
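
The product formula of Proposition 2.3.1 can be verified numerically by running the normalized and unnormalized recursions side by side. The sketch below assumes a two-state, time-homogeneous model; the helper names and the values of `mu`, `M`, and `G` are illustrative assumptions.

```python
def boltzmann_gibbs(eta, G):
    """Psi(eta)(x) = G(x) eta(x) / eta(G)."""
    mass = sum(e * g for e, g in zip(eta, G))
    return [e * g / mass for e, g in zip(eta, G)]

def push_forward(measure, M):
    """(measure M)(y) = sum_x measure(x) M(x, y)."""
    return [sum(measure[x] * M[x][y] for x in range(len(measure)))
            for y in range(len(M[0]))]

mu = [0.5, 0.5]
M = [[0.9, 0.1], [0.2, 0.8]]
G = [1.0, 0.5]
n = 3

eta = list(mu)                      # eta_0 = mu
gamma = list(mu)                    # gamma_0 = mu
product_of_masses = 1.0             # running value of prod_{p<n} eta_p(G_p)
for _ in range(n):
    product_of_masses *= sum(e * g for e, g in zip(eta, G))
    # Normalized flow: eta_{k+1} = Psi_k(eta_k) M_{k+1}, from (2.9) and (2.10).
    eta = push_forward(boltzmann_gibbs(eta, G), M)
    # Unnormalized flow: gamma_{k+1}(f) = gamma_k(G_k M_{k+1}(f)).
    gamma = push_forward([c * g for c, g in zip(gamma, G)], M)

reconstructed = [e * product_of_masses for e in eta]   # eta_n * product
```

In particle approximations the normalized flow is the object that is actually sampled, so the product formula is what gives access to the unnormalized quantities.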




2.4 Structural Stability Properties 

Feynman-Kac models have various structural stability properties. In
Section 2.4.1, we will see that the path measures and their time marginals
have the same algebraic structure, provided the potential energy of a path
only depends on the current state. We will use this property to define
genealogical tree-based approximations of Feynman-Kac path measures. In
Section 2.4.2, we describe the class of Feynman-Kac flows that is connected
to a given reference model by a change of probability measure. Finally,
Section 2.4.3 connects updated and prediction models.

These three structural stability properties are not of pure mathematical
interest. They confer to these formulae a stable and rich algebraic structure
that allows direct transfer of many known results on marginal or prediction
models to path-space formulae or updated flows.







2.4.1 Path Space and Marginal Models

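Let $X_n$ be a nonhomogeneous Markov chain with Markov transitions $M_{n+1}$
from $E_n$ into $E_{n+1}$ and initial distribution $\mu \in \mathcal{P}(E_0)$.
Also let $G_n$ be a given collection of bounded measurable nonnegative
functions on $E_n$ such that for any $n \in \mathbb{N}$,
$\mathbb{E}_\mu(\prod_{p=0}^{n} G_p(X_p)) > 0$, where $\mathbb{E}_\mu(.)$ represents
the expectation with respect to the distribution $\mathbb{P}_\mu$ of $X_n$. We
consider the Feynman-Kac path measures $\mathbb{Q}_{\mu,n}$ associated with the
pair $(G_n, M_n)$ and defined by

$$\mathbb{Q}_{\mu,n}(d(x_0, \ldots, x_n)) = \frac{1}{\mathcal{Z}_n} \Big\{\prod_{p=0}^{n-1} G_p(x_p)\Big\}\; \mathbb{P}_{\mu,n}(d(x_0, \ldots, x_n))$$

where $\mathbb{P}_{\mu,n}$ denotes the probability measure of the path
$(X_0, \ldots, X_n)$

$$\mathbb{P}_{\mu,n}(d(x_0, \ldots, x_n)) = \mu(dx_0)\, M_1(x_0, dx_1) \cdots M_n(x_{n-1}, dx_n)$$

We further suppose that

$$X_n = X'_{[0,n]}\ (= (X'_0, \ldots, X'_n)) \in E_n = E'_{[0,n]}\ (= E'_0 \times \cdots \times E'_n)$$

represents the path process associated with an $E'_n$-valued Markov chain
$X'_n$ with Markov transitions $M'_{n+1}$ from $E'_n$ into $E'_{n+1}$. By
construction, we notice that the initial random variable $X_0 = X'_0$ is
distributed according to $\mu \in \mathcal{P}(E_0)\ (= \mathcal{P}(E'_0))$. In this
situation, the unnormalized prediction measures $\gamma_n$ on $E_n$ have the
form

$$\gamma_n(f_n) = \mathbb{E}_\mu\Big(f_n(X'_{[0,n]}) \prod_{p=0}^{n-1} G_p(X'_{[0,p]})\Big)$$

As a result, the corresponding normalized distributions $\eta_n$ are given by

$$\eta_n(d(x'_0, \ldots, x'_n)) = \frac{1}{\gamma_n(1)} \Big\{\prod_{p=0}^{n-1} G_p(x'_0, \ldots, x'_p)\Big\}\; \mathbb{P}'_{\mu,n}(d(x'_0, \ldots, x'_n))$$

where $\mathbb{P}'_{\mu,n} = \mathbb{P}_\mu \circ (X'_{[0,n]})^{-1}$ stands for the
distribution of the path of $X'$ from the origin up to time $n$. In other
words

$$\mathbb{P}'_{\mu,n}(d(x'_0, \ldots, x'_n)) = \mu(dx'_0)\, M'_1(x'_0, dx'_1) \cdots M'_n(x'_{n-1}, dx'_n)$$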

Next we examine the situation where the potential functions $G_n$ only
depend on the terminal point of the path. That is, we have that

$$G_n : x_n = (x'_0, \ldots, x'_n) \in E_n \longmapsto G_n(x'_0, \ldots, x'_n) = G'_n(x'_n)$$

In this case, we readily check that the $n$-time marginal distribution
$\eta_n$ of the path measure coincides with the Feynman-Kac path measure
$\mathbb{Q}'_{\mu,n}$ associated with the pair $(G'_n, M'_n)$. More precisely, we
have that

$$\eta_n(d(x'_0, \ldots, x'_n)) = \frac{1}{\mathcal{Z}'_n} \Big\{\prod_{p=0}^{n-1} G'_p(x'_p)\Big\}\; \mathbb{P}'_{\mu,n}(d(x'_0, \ldots, x'_n)) \tag{2.11}$$

with the same partition functions $\mathcal{Z}'_n = \mathcal{Z}_n = \gamma_n(1)$
$(> 0)$.

Moreover, their $n$th time marginals $\eta'_n$ are again defined for any test
function $f'_n \in \mathcal{B}_b(E'_n)$ by the Feynman-Kac formulae

$$\eta'_n(f'_n) = \gamma'_n(f'_n)/\gamma'_n(1) \quad\text{with}\quad \gamma'_n(f'_n) = \mathbb{E}_\mu\Big(f'_n(X'_n) \prod_{p=0}^{n-1} G'_p(X'_p)\Big)$$

We will use these structural properties of Feynman-Kac models in several
places in this book. Given a reference Feynman-Kac distribution model
$\eta_n$, we will use the notation $\mathbb{Q}_n$ to represent the corresponding
path-distribution model, and whenever $\eta_n$ is already a path measure we
will denote by $\eta'_n$ the marginal distribution flow. We summarize the
preceding discussion with the following synthetic diagram:

$$\mathbb{Q}_n \;\xleftarrow{\ \text{path measure}\ }\; \eta_n \;\xrightarrow{\ \text{marginal measure}\ }\; \eta'_n$$



2.4.2 Change of Reference Probability Measures

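In this section, we describe a class of Feynman-Kac models that are
equivalent by a change of reference measure on the canonical space

$$\Big(\Omega = \prod_{n \ge 0} E'_n,\ \mathcal{F} = (\mathcal{F}_n)_{n \in \mathbb{N}},\ X' = (X'_n)_{n \in \mathbb{N}}\Big)$$

where $(E'_n, \mathcal{E}'_n)$, $n \in \mathbb{N}$, is a given collection of
measurable spaces. Let $\mu$ and $\overline{\mu}$ be two distributions on $E'_0$
with $\mu \ll \overline{\mu}$. Also let $M'_n(x_{n-1}, dx_n)$ and
$\overline{M}'_n(x_{n-1}, dx_n)$ be collections of Markov transitions from
$E'_{n-1}$ into $E'_n$ such that for any $x_{n-1} \in E'_{n-1}$

$$M'_n(x_{n-1}, .) \ll \overline{M}'_n(x_{n-1}, .)$$

As in the beginning of Section 2.3, we associate with the pairs
$(\mu, M'_n)$ and $(\overline{\mu}, \overline{M}'_n)$ the laws $\mathbb{P}_\mu$ and
$\overline{\mathbb{P}}_{\overline{\mu}}$ on the canonical space of two
nonhomogeneous Markov chains with the $n$-time marginals $\mathbb{P}_{\mu,n}$
and $\overline{\mathbb{P}}_{\overline{\mu},n}$ given by

$$\begin{aligned}
\mathbb{P}_{\mu,n}(d(x_0, \ldots, x_n)) &= \mu(dx_0)\, M'_1(x_0, dx_1) \cdots M'_n(x_{n-1}, dx_n) \\
\overline{\mathbb{P}}_{\overline{\mu},n}(d(x_0, \ldots, x_n)) &= \overline{\mu}(dx_0)\, \overline{M}'_1(x_0, dx_1) \cdots \overline{M}'_n(x_{n-1}, dx_n)
\end{aligned}$$

Under our assumptions, the measure $\mathbb{P}_\mu$ is locally absolutely
continuous with respect to $\overline{\mathbb{P}}_{\overline{\mu}}$ and for each
$n \ge 0$ and for $\overline{\mathbb{P}}_{\overline{\mu},n}$ almost every sequence
$(x_0, \ldots, x_n) \in E'_{[0,n]}$ we have

$$\frac{d\mathbb{P}_{\mu,n}}{d\overline{\mathbb{P}}_{\overline{\mu},n}}(x_0, \ldots, x_n) = \frac{d\mu}{d\overline{\mu}}(x_0) \prod_{p=1}^{n} \frac{dM'_p(x_{p-1}, .)}{d\overline{M}'_p(x_{p-1}, .)}(x_p)$$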



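We use the notation $\mathbb{E}_\mu(.)$ and
$\overline{\mathbb{E}}_{\overline{\mu}}(.)$ for the expectation with respect to
$\mathbb{P}_\mu$ and $\overline{\mathbb{P}}_{\overline{\mu}}$. We denote by

$$X_n = X'_{[0,n]} \in E_n = E'_{[0,n]}$$

the historical process of the chain $X'_n$ and by $M_n$ and $\overline{M}_n$ the
corresponding Markov transitions under the reference measures
$\mathbb{P}_\mu$ and $\overline{\mathbb{P}}_{\overline{\mu}}$. Finally, let $G_n$ be
a given collection of potential functions

$$G_n : (x_0, \ldots, x_n) \in E_n = E'_{[0,n]} \longmapsto G_n(x_0, \ldots, x_n) \in [0, \infty)$$

such that for any $n \ge 0$

$$\widehat{\gamma}_n(1) = \mathbb{E}_\mu\Big(\prod_{p=0}^{n} G_p(X_p)\Big) = \mathbb{E}_\mu\Big(\prod_{p=0}^{n} G_p(X'_{[0,p]})\Big) > 0$$

We recall that the updated Feynman-Kac distribution flow model associated
with the pair $(G_n, M_n)$ is given for any $f_n \in \mathcal{B}_b(E_n)$ by

$$\widehat{\eta}_n(f_n) = \widehat{\gamma}_n(f_n)/\widehat{\gamma}_n(1)$$

with

$$\widehat{\gamma}_n(f_n) = \mathbb{E}_\mu\Big(f_n(X'_{[0,n]}) \prod_{p=0}^{n} G_p(X'_{[0,p]})\Big)$$

By construction, we have that

$$\widehat{\gamma}_n(f_n) = \overline{\mathbb{E}}_{\overline{\mu}}\Big(f_n(X'_{[0,n]}) \prod_{p=0}^{n} \overline{G}_p(X'_{[0,p]})\Big)$$

with the potential function $\overline{G}_n$ on $E'_0 \times \cdots \times E'_n$
defined by

$$\overline{G}_n(x_0, \ldots, x_n) = G_n(x_0, \ldots, x_n) \times \frac{dM'_n(x_{n-1}, .)}{d\overline{M}'_n(x_{n-1}, .)}(x_n)$$

and the convention for $n = 0$, $M'_0(x_{-1}, .) = \mu$, and
$\overline{M}'_0(x_{-1}, .) = \overline{\mu}$ so that

$$\overline{G}_0(x_0) = G_0(x_0)\; \frac{d\mu}{d\overline{\mu}}(x_0)$$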

From these observations, we prove easily the following proposition.

Proposition 2.4.1 The prediction and updated Feynman-Kac models
associated with the pairs $(G_n, M_n)$ and $(\overline{G}_n, \overline{M}_n)$
coincide.
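
A small finite-state experiment illustrates Proposition 2.4.1. The sketch below assumes a two-state chain, a potential depending only on the terminal point of the path, and a test function of the terminal point; the names `updated_gamma`, `G_bar`, and all numerical values are illustrative assumptions. It computes the unnormalized updated measure under both reference chains by path enumeration.

```python
from itertools import product

def updated_gamma(mu, M, potential, f, n):
    """E[ f(X'_n) prod_{p=0}^{n} potential(p, (x_0..x_p)) ] for the chain (mu, M)."""
    total = 0.0
    for path in product(range(len(mu)), repeat=n + 1):
        weight = mu[path[0]]
        for p in range(1, n + 1):
            weight *= M[path[p - 1]][path[p]]
        for p in range(n + 1):
            weight *= potential(p, path[:p + 1])
        total += weight * f[path[n]]
    return total

mu, mu_bar = [0.5, 0.5], [0.2, 0.8]
M = [[0.9, 0.1], [0.2, 0.8]]
M_bar = [[0.5, 0.5], [0.5, 0.5]]    # dominates M: positive wherever M is
G_state = [1.0, 0.5]
f = [1.0, 3.0]

def G(p, path):
    return G_state[path[-1]]

def G_bar(p, path):
    # G times the density of the original transition w.r.t. the new one,
    # with the convention that the "transition" at time 0 is the initial law.
    if p == 0:
        ratio = mu[path[0]] / mu_bar[path[0]]
    else:
        ratio = M[path[-2]][path[-1]] / M_bar[path[-2]][path[-1]]
    return G(p, path) * ratio

original = updated_gamma(mu, M, G, f, 2)
changed = updated_gamma(mu_bar, M_bar, G_bar, f, 2)
```

Both computations return the same value, although the second one would lead to a different particle interpretation, with uniform moves and path-dependent weights.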

Each representation of a Feynman-Kac model in terms of a pair $(G_n, M_n)$
will correspond to a different physical interpretation and will lead to
different particle approximation models. As we shall see in the forthcoming
sections, $M_n$ corresponds to the elementary transitions of a random Markov
particle evolving in an environment with absorbing potential $G_n$.

We end this section with an elementary Feynman-Kac formula which allows
us to change the potential functions without changing the underlying Markov
chain. Let $G_n$ and $\overline{G}_n$ be a sequence of positive potential
functions on the state spaces $E'_{[0,n]}$. We further assume that the ratio
function $G_n/\overline{G}_n$ is a well defined bounded function. We associate
to the pair $(G_n, \overline{G}_n)$ the transformations
$T^{(n)}_{G, \overline{G}}$ from $\mathcal{B}_b(E'_{[0,n]})$ into itself defined by

$$T^{(n)}_{G, \overline{G}}(f_n)(x_0, \ldots, x_n) = f_n(x_0, \ldots, x_n) \prod_{p=0}^{n} \frac{G_p(x_0, \ldots, x_p)}{\overline{G}_p(x_0, \ldots, x_p)}$$

It is now readily checked that for any $f_n \in \mathcal{B}_b(E'_{[0,n]})$ we have
the equivalent formulations

$$\widehat{\gamma}_n(f_n) =_{\text{def.}} \mathbb{E}_\mu\Big(f_n(X'_{[0,n]}) \prod_{p=0}^{n} G_p(X'_{[0,p]})\Big) = \mathbb{E}_\mu\Big(T^{(n)}_{G, \overline{G}}(f_n)(X'_{[0,n]}) \prod_{p=0}^{n} \overline{G}_p(X'_{[0,p]})\Big) \tag{2.12}$$

From the numerical point of view, the choice of the pair $(G_n, M_n)$ is
related to some physical knowledge of the probability mass evolution of
the Feynman-Kac models. In some instances, a judicious choice of Markov
transitions will drive the particles in the regions with high probability
mass. Proposition 2.4.1 gives the way to change the potential functions
accordingly. In other instances, the choice of the absorbing potential
functions may induce a too crude selection transition. In this case formula
(2.12) gives the way to choose the potential function by changing the test
function.



2.4.3 Updated and Prediction Flow Models

In this short section, we discuss the connections between the updated and
prediction flow models introduced in Section 2.3. We further require that
the pairs $(G_n, M_n)$ satisfy for any $x_n \in E_n$ and $n \in \mathbb{N}$ the
following condition:

$$\widehat{G}_n(x_n) = M_{n+1}(G_{n+1})(x_n) \in (0, \infty) \tag{2.13}$$







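In this situation, the integral operators

$$\widehat{M}_n(x_{n-1}, dx_n) = \frac{M_n(x_{n-1}, dx_n)\; G_n(x_n)}{M_n(G_n)(x_{n-1})} \tag{2.14}$$

are well-defined Markov kernels from $E_{n-1}$ into $E_n$. Also notice that in
this case $\widehat{\eta}_0 = \Psi_0(\eta_0)$ is a well-defined distribution on
$E_0$ provided $\eta_0(G_0) > 0$.

We associate with $\widehat{\eta}_0$ and $\widehat{M}_n$ the probability
distribution $\widehat{\mathbb{P}}_{\widehat{\eta}_0}$ of a canonical Markov chain
with initial distribution $\widehat{\eta}_0$ and elementary transitions
$\widehat{M}_n$. By construction, the $n$th time marginals
$\widehat{\mathbb{P}}_{\widehat{\eta}_0, n}$ of
$\widehat{\mathbb{P}}_{\widehat{\eta}_0}$ are given by

$$\widehat{\mathbb{P}}_{\widehat{\eta}_0, n}(d(x_0, \ldots, x_n)) = \widehat{\eta}_0(dx_0)\, \widehat{M}_1(x_0, dx_1) \cdots \widehat{M}_n(x_{n-1}, dx_n) \tag{2.15}$$

The Feynman-Kac path measures associated with these new pairs of
potential/kernel $(\widehat{G}_n, \widehat{M}_n)$ are now defined by

$$\widehat{\mathbb{Q}}_{\widehat{\eta}_0, n}(d(x_0, \ldots, x_n)) = \frac{1}{\mathcal{Z}_n} \Big\{\prod_{p=0}^{n-1} \widehat{G}_p(x_p)\Big\}\; \widehat{\mathbb{P}}_{\widehat{\eta}_0, n}(d(x_0, \ldots, x_n))$$

Under condition (2.13), we prove that the normalizing constants
$\mathcal{Z}_n$ are always strictly positive. It is also easily proved that
$\widehat{\mathbb{Q}}_{\widehat{\eta}_0, n}$ can alternatively be written in terms
of the original pairs $(G_n, M_n)$ with the formula

$$\widehat{\mathbb{Q}}_{\widehat{\eta}_0, n}(d(x_0, \ldots, x_n)) = \frac{1}{\widehat{\mathcal{Z}}_n} \Big\{\prod_{p=0}^{n} G_p(x_p)\Big\}\; \mathbb{P}_{\eta_0, n}(d(x_0, \ldots, x_n))$$

and the normalizing constant
$\widehat{\mathcal{Z}}_n = \eta_0(G_0)\, \mathcal{Z}_n > 0$. From this simple
observation, we prove the following proposition.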

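Proposition 2.4.2 The updated Feynman-Kac models $\hat{\mathbb{Q}}_{\eta_0,n}$ and $\hat\eta_n$ associated
with the pair $(G_n, M_n)$ coincide with the prediction Feynman-Kac
models associated with the pair $(\hat G_n, \hat M_n)$ and starting at $\hat\eta_0 = \Psi_0(\eta_0)$.
We have for any $n \in \mathbb{N}$, $f_n \in \mathcal{B}_b(E_n)$, and $F_n \in \mathcal{B}_b(E_{[0,n]})$

$$\hat{\mathbb{Q}}_{\eta_0,n}(F_n) = \frac{\hat{\mathbb{E}}_{\hat\eta_0}\big(F_n(X_0, \ldots, X_n) \prod_{p=0}^{n-1} \hat G_p(X_p)\big)}{\hat{\mathbb{E}}_{\hat\eta_0}\big(\prod_{p=0}^{n-1} \hat G_p(X_p)\big)}$$

and $\hat\eta_n(f_n) = \hat{\mathbb{E}}_{\hat\eta_0}\big(f_n(X_n) \prod_{p=0}^{n-1} \hat G_p(X_p)\big) / \hat{\mathbb{E}}_{\hat\eta_0}\big(\prod_{p=0}^{n-1} \hat G_p(X_p)\big)$.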
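As an informal illustration of Proposition 2.4.2, the following sketch works on a three-point state space with a time-homogeneous pair $(G, M)$. The numerical values, the helper names (`hat_pair`, `boltzmann_gibbs`, `apply_kernel`) and the time-homogeneity are assumptions made only for this example. It builds the pair $(\hat G, \hat M)$ as in (2.13) and (2.14) and compares the updated flow of $(G, M)$ with the prediction flow of $(\hat G, \hat M)$ started at $\Psi_0(\eta_0)$.

```python
# Finite-state sanity check of Proposition 2.4.2 (illustrative numbers only).

def normalize(v):
    s = sum(v)
    return [x / s for x in v]

def apply_kernel(mu, K):
    """Measure times kernel: (mu K)(j) = sum_i mu(i) K(i, j)."""
    return [sum(mu[i] * K[i][j] for i in range(len(mu))) for j in range(len(K[0]))]

def boltzmann_gibbs(mu, G):
    """Psi(mu)(i) = G(i) mu(i) / mu(G)."""
    return normalize([g * m for g, m in zip(G, mu)])

def hat_pair(G, M):
    """G_hat = M(G) as in (2.13); M_hat(i, j) = M(i, j) G(j) / M(G)(i) as in (2.14)."""
    d = len(G)
    G_hat = [sum(M[i][j] * G[j] for j in range(d)) for i in range(d)]
    M_hat = [[M[i][j] * G[j] / G_hat[i] for j in range(d)] for i in range(d)]
    return G_hat, M_hat

G = [0.2, 0.9, 0.5]                      # hypothetical strictly positive potential
M = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.4, 0.4, 0.2]]                    # hypothetical Markov transition
eta0 = [0.3, 0.3, 0.4]                   # hypothetical initial distribution

G_hat, M_hat = hat_pair(G, M)

# Updated flow of (G, M): eta_hat_n = Psi(eta_n), then eta_{n+1} = eta_hat_n M.
eta = eta0
updated = []
for _ in range(6):
    eta_hat = boltzmann_gibbs(eta, G)
    updated.append(eta_hat)
    eta = apply_kernel(eta_hat, M)

# Prediction flow of (G_hat, M_hat) started at Psi(eta0).
mu = boltzmann_gibbs(eta0, G)
predicted = [mu]
for _ in range(5):
    mu = apply_kernel(boltzmann_gibbs(mu, G_hat), M_hat)
    predicted.append(mu)

max_gap = max(abs(a - b) for u, p in zip(updated, predicted) for a, b in zip(u, p))
print("largest discrepancy between the two flows:", max_gap)
```

The two flows agree up to floating-point rounding, which is the finite-state content of the proposition for the one-time marginals.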

This stability property indicates one way to transfer results between
updated and prediction models. In this connection, we already mentioned
that in some instances it is more judicious to interpret $\hat\eta_n$ as the prediction
flow associated with the pair $(\hat G_n, \hat M_n)$ and starting at $\hat\eta_0$. To illustrate this
assertion, we examine hereafter the situation where the potential function
$G_n$ may take some null values and we set

$$\hat E_n = \{x_n \in E_n \; ; \; G_n(x_n) > 0\}$$







It may happen that the set $\hat E_n$ is not $M_n$-accessible from any point in
$E_{n-1}$. In this case, we may have $M_n(x_{n-1}, \hat E_n) = 0$ for some $x_{n-1} \in E_{n-1}$
and therefore $M_n(G_n)(x_{n-1}) = 0$. In this situation, the Markov kernel
$\hat M_n$ introduced in (2.14) is not well-defined on the whole set $E_{n-1}$. This
irregularity property creates some technical difficulties in defining properly
the dynamical structure of the corresponding Feynman-Kac models. Next
we weaken (2.13) and we consider the condition

$$(\mathcal{A}) \qquad \forall x_n \in \hat E_n \quad M_{n+1}(x_n, \hat E_{n+1}) > 0 \quad \text{and} \quad \eta_0(\hat E_0) > 0 \tag{2.16}$$

In words, this condition says that the set $\hat E_{n+1}$ is accessible from any point
in $\hat E_n$. In this case, we readily check that condition (2.13) is only met for
any $x_n \in \hat E_n$. More precisely, we have for any $n \in \mathbb{N}$ and $x_n \in \hat E_n$

$$\hat G_n(x_n) = M_{n+1}(G_{n+1})(x_n) \in (0, \infty)$$

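More interestingly, in this situation the integral operators $\hat M_n$ defined for
any $x_{n-1} \in \hat E_{n-1}$ by

$$\hat M_n(x_{n-1}, dx_n) = \frac{M_n(x_{n-1}, dx_n)\, G_n(x_n)}{M_n(G_n)(x_{n-1})}$$

are well-defined Markov kernels from $\hat E_{n-1}$ into $\hat E_n$. Finally we note that,
for any $\eta_0 \in \mathcal{P}(E_0)$ with $\eta_0(\hat E_0) > 0$, the updated measure $\hat\eta_0 = \Psi_0(\eta_0)$
is such that $\hat\eta_0(\hat E_0) = 1$. From the preceding discussion we see that the
distributions $\hat{\mathbb{P}}_{\hat\eta_0,n}$ defined in (2.15) are such that $\hat{\mathbb{P}}_{\hat\eta_0,n}(\hat E_{[0,n]}) = 1$. In addition,
the distribution on $\hat\Omega_n = \hat E_{[0,n]}$ defined by

$$\hat{\mathbb{P}}_{\hat\eta_0,n}(d(x_0, \ldots, x_n)) = \hat\eta_0(dx_0)\, \hat M_1(x_0, dx_1) \cdots \hat M_n(x_{n-1}, dx_n)$$

can be extended by consistency arguments to the whole canonical space

$$\Big( \hat\Omega = \prod_{n \geq 0} \hat E_n, \quad \hat{\mathcal{F}} = (\hat{\mathcal{F}}_n)_{n \in \mathbb{N}}, \quad X = (X_n)_{n \in \mathbb{N}}, \quad \hat{\mathbb{P}}_{\hat\eta_0} \Big)$$

Under $\hat{\mathbb{P}}_{\hat\eta_0}$ the canonical process is a Markov chain $X_n$ with initial
distribution $\hat\eta_0$ and elementary transitions $\hat M_n$ from $\hat E_{n-1}$ into $\hat E_n$. Summarizing
the discussion above, we get the following proposition.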

Proposition 2.4.3 When the accessibility condition $(\mathcal{A})$ is met, the
updated Feynman-Kac measures $\hat{\mathbb{Q}}_{\eta_0,n} \in \mathcal{P}(\hat E_{[0,n]})$ and $\hat\eta_n \in \mathcal{P}(\hat E_n)$ can be
interpreted as the prediction models associated with the pair potential/kernel
$(\hat G_n, \hat M_n)$ on the restricted state spaces $(\hat E_n, \hat{\mathcal{E}}_n)$.







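For instance, if we choose $G_n = 1_{\hat E_n}$, then under $\hat{\mathbb{P}}_{\hat\eta_0}$ the canonical process
is the "local" restriction of the original chain with transition $M_n$ to the
sets $\hat E_n$. That is, we have for any $x_{n-1} \in \hat E_{n-1}$

$$\hat M_n(x_{n-1}, dx_n) = \frac{M_n(x_{n-1}, dx_n)\, 1_{\hat E_n}(x_n)}{M_n(x_{n-1}, \hat E_n)}$$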



2.5 Distribution Flows Models 

In previous sections, we have introduced a variety of Feynman-Kac model- 
ing techniques. We have underlined several structural stability properties 
and the interplay between these distributions. In the present section, we
provide two different physical interpretations. The first one is the tradi- 
tional trapping interpretation. The leading idea consists in turning a sub- 
Markov property into the Markov situation by adding a cemetery or coffin 
state to the state spaces. In this context, the Feynman-Kac models repre- 
sent the conditional probabilities of a nonabsorbed Markov particle evolving 
in an environment with obstacles. One drawback of this physical interpre- 
tation is that the resulting killed Markov model is defined on a different 
canonical probability space and the particle motion is instantly stopped as 
soon as it is trapped by an obstacle. 

In the second part of this section, we adopt a different point of view. 
This alternative physical interpretation is based on measure-valued and in- 
teracting processes ideas. Loosely speaking, instead of adding an auxiliary 
cemetery point to the state space, when the particle dies then instantly a 
new particle is created at a site randomly chosen according to the current 
distribution of the model. This interacting jump process is defined on the 
same original canonical space by changing the reference probability mea- 
sure by an appropriate McKean measure. Under the latter, the Feynman- 
Kac measures represent the distribution laws of a birth and death process. 
This interacting process interpretation will be the stepping stone of our
construction of particle and genealogical approximation models. 

To clarify the presentation, we will assume that the potential functions
are strictly positive; that is, we have for any $x_n \in E_n$, $G_n(x_n) > 0$. As
noted in Section 2.4.3, when the potential functions $G_n$ are not strictly
positive, we have to carry out a more careful analysis. In this situation, it
is often more convenient and natural to work on the state spaces

$$\hat E_n = \{x_n \in E_n \; ; \; G_n(x_n) > 0\}$$

with the potential functions

$$\hat G_n(x_n) = M_{n+1}(G_{n+1})(x_n)$$

This strategy works when the accessibility condition $(\mathcal{A})$ presented in (2.16)
is met. In the final part of this section, we will discuss the difficulties







that can arise when $(\mathcal{A})$ is not met. The main advantage of the preceding
condition is that the updated model $\hat\eta_n$ can be regarded as a distribution
flow on $\mathcal{P}(\hat E_n)$. More precisely, it can be interpreted as a prediction model
with transitions $\hat M_n$ from $\hat E_{n-1}$ into $\hat E_n$ and potential functions $\hat G_n$ on $\hat E_n$
such that for any $x_n \in \hat E_n$, $\hat G_n(x_n) > 0$. On the other hand, since the
potential functions $G_n$ are assumed to be bounded, we can replace in the
definition of the normalized measures $\eta_n$, $\hat\eta_n$ the functions $G_n$ by $G_n / \|G_n\|$
without altering their nature. From all these observations, we see that there
is no great loss of generality in considering potential functions $G_n$ in the
half unit ball. Unless otherwise stated, in this section we will assume that
$0 < G_n(x_n) \leq 1$.

Definition 2.5.1 We identify the potential functions $G_n$ with the Boltzmann
multiplicative operator $\mathcal{G}_n$ on $\mathcal{B}_b(E_n)$ defined for any $f_n \in \mathcal{B}_b(E_n)$
and $x_n \in E_n$ by the equation

$$\mathcal{G}_n(f_n)(x_n) = G_n(x_n)\, f_n(x_n)$$

We can alternatively see $\mathcal{G}_n$ as the integral operator on $E_n$ defined by

$$\mathcal{G}_n(x_n, dy_n) = G_n(x_n)\, \delta_{x_n}(dy_n)$$

In this connection, we see that $\mathcal{G}_n$ is a sub-Markov kernel

$$\mathcal{G}_n(x_n, E_n) = G_n(x_n) \leq 1$$

We continue our program, noting that the positive measures $\gamma_n$ satisfy a
linear equation of the form

$$\gamma_n = \gamma_{n-1} Q_n \tag{2.17}$$

where $Q_n$ are the bounded nonnegative operators on $\mathcal{B}_b(E_n)$ defined by
$Q_n = \mathcal{G}_{n-1} M_n$. To emphasize the role of each quantity we also note that

$$Q_n(f_n) = G_{n-1}\, M_n(f_n) \quad \text{and} \quad 0 < Q_n(1) = G_{n-1} \leq 1$$

The right-hand side in the last display shows that the sub-Markov property
of the operators $\mathcal{G}_n$ is transferred to $Q_n$. In the next two propositions, we have collected
some structural properties of Feynman-Kac flows.

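Proposition 2.5.1 The unnormalized prediction and updated Feynman-Kac
measures $\gamma_n$ and $\hat\gamma_n$ associated with the pair $(G_n, M_n)$ satisfy the linear
recursive equations $\gamma_n = \gamma_{n-1} Q_n$ and $\hat\gamma_n = \hat\gamma_{n-1} \hat Q_n$ with the bounded
nonnegative operators defined by

$$Q_n = \mathcal{G}_{n-1} M_n \quad \text{and} \quad \hat Q_n = M_n \mathcal{G}_n$$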







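Proposition 2.5.2 The normalized prediction and updated Feynman-Kac
distributions $\eta_n$ and $\hat\eta_n$ associated with the pair $(G_n, M_n)$ satisfy the nonlinear
recursive equations $\eta_n = \Phi_n(\eta_{n-1})$ and $\hat\eta_n = \hat\Phi_n(\hat\eta_{n-1})$ with the mappings
$\Phi_n$ and $\hat\Phi_n$ from $\mathcal{P}(E_{n-1})$ into $\mathcal{P}(E_n)$ defined for any $\eta \in \mathcal{P}(E_{n-1})$
by

$$\Phi_n(\eta) = \Psi_{n-1}(\eta) M_n \quad \text{and} \quad \hat\Phi_n(\eta) = \Psi_n(\eta M_n) = \hat\Psi_{n-1}(\eta) \hat M_n$$

In the last display, $\Psi_n$ and $\hat\Psi_n$ denote the Boltzmann-Gibbs transformations
on $\mathcal{P}(E_n)$ given by

$$\Psi_n(\mu)(dx_n) = \frac{G_n(x_n)\, \mu(dx_n)}{\mu(G_n)} \quad \text{and} \quad \hat\Psi_n(\mu)(dx_n) = \frac{\hat G_n(x_n)\, \mu(dx_n)}{\mu(\hat G_n)}$$

The pair potentials/kernels $(\hat G_n, \hat M_n)$ are defined for any $f_n \in \mathcal{B}_b(E_n)$ by
the formulae

$$\hat G_n = M_{n+1}(G_{n+1}) \quad \text{and} \quad \hat M_n(f_n) = M_n(f_n G_n) / M_n(G_n)$$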
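On a finite state space the recursions of Propositions 2.5.1 and 2.5.2 reduce to vector and matrix arithmetic. The sketch below is only an illustration under assumed data (a three-point space, a time-homogeneous pair, and the hypothetical helper names `apply_Q` and `Phi`). It runs the linear recursion $\gamma_n = \gamma_{n-1} Q_n$ and the nonlinear recursion $\eta_n = \Phi_n(\eta_{n-1})$ side by side and checks that $\eta_n = \gamma_n / \gamma_n(1)$ and $\gamma_{n+1}(1) = \gamma_n(1)\, \eta_n(G_n)$.

```python
# Linear versus nonlinear Feynman-Kac recursions on a finite state space.

G = [0.2, 0.9, 0.5]                      # hypothetical potential with 0 < G <= 1
M = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.4, 0.4, 0.2]]                    # hypothetical Markov transition
eta0 = [0.3, 0.3, 0.4]                   # hypothetical initial distribution

def apply_Q(gamma, G, M):
    """gamma Q with Q(i, j) = G(i) M(i, j), the sub-Markov operator Q = G M."""
    d = len(G)
    return [sum(gamma[i] * G[i] * M[i][j] for i in range(d)) for j in range(d)]

def Phi(eta, G, M):
    """Phi(eta) = Psi(eta) M = eta Q / eta(G)."""
    eta_G = sum(e * g for e, g in zip(eta, G))
    return [v / eta_G for v in apply_Q(eta, G, M)]

gamma, eta = eta0[:], eta0[:]
mass_from_product = 1.0                  # running product of eta_p(G_p)
gaps = []
for n in range(8):
    mass_from_product *= sum(e * g for e, g in zip(eta, G))
    gamma = apply_Q(gamma, G, M)         # gamma_{n+1}
    eta = Phi(eta, G, M)                 # eta_{n+1}
    total = sum(gamma)                   # gamma_{n+1}(1)
    gaps.append(max(abs(g / total - e) for g, e in zip(gamma, eta)))
    gaps.append(abs(total - mass_from_product))

print("largest discrepancy:", max(gaps))
```

The unnormalized masses decay geometrically here, which is one practical reason to propagate the normalized flow and accumulate the product of the $\eta_p(G_p)$ separately.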



We end this section with a discussion of the case where $G_n$ may take null
values. First, we recall that the accessibility condition $(\mathcal{A})$ ensures that the
Feynman-Kac normalizing constants are always well-defined. To prove this
assertion, we can use the interpretation of the flow $\hat\eta_n$ as the prediction
flow associated with the pair $(\hat G_n, \hat M_n)$ and check that

$$\mathbb{E}_{\eta_0}\Big(\prod_{p=0}^{n} G_p(X_p)\Big) = \eta_0(G_0)\; \hat{\mathbb{E}}_{\hat\eta_0}\Big(\prod_{p=0}^{n-1} \hat G_p(X_p)\Big) > 0$$

This shows in particular that for any $n \in \mathbb{N}$ we have

$$\eta_n \in \mathcal{P}_n(E_n) = \{\eta \in \mathcal{P}(E_n) \; ; \; \eta(G_n) > 0\}$$

and the Feynman-Kac flow is a well-defined two-step updating/prediction
model

$$\eta_n \in \mathcal{P}_n(E_n) \xrightarrow{\ \text{updating}\ } \hat\eta_n \in \mathcal{P}(\hat E_n) \xrightarrow{\ \text{prediction}\ } \eta_{n+1} \in \mathcal{P}_{n+1}(E_{n+1})$$

When the accessibility condition $(\mathcal{A})$ is not met, then the set $\hat E_{n+1}$ is not
$M_{n+1}$-accessible from any point in $\hat E_n$, and it may happen that

$$\hat\eta_n M_{n+1}(G_{n+1}) = \eta_{n+1}(G_{n+1}) = 0$$

In this situation, the Feynman-Kac flow $\eta_n$ is well-defined up to the first
time $\tau$ we have $\eta_\tau(G_\tau) = 0$. At time $\tau$, the measure $\eta_\tau$ cannot be updated







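anymore. Recalling that $\eta_\tau(G_\tau) = \gamma_{\tau+1}(1) / \gamma_\tau(1)$, we also see that $\tau$ coincides
with the first time the Feynman-Kac normalizing constants become
null, that is

$$\gamma_{\tau+1}(1) = \mathbb{E}_{\eta_0}\Big(\prod_{p=0}^{\tau} G_p(X_p)\Big) = 0$$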



2.5.1 Killing Interpretation 

The first way to turn the sub-Markovian kernels $Q_n$ into the Markov case
consists in adding a common cemetery point $c$ to the state spaces $E_n$ and
in extending the various quantities as follows.

• The test functions $f_n \in \mathcal{B}_b(E_n)$ and the potential functions $G_n$ are
first extended to $E_n^c = E_n \cup \{c\}$ by setting $f_n(c) = 0 = G_n(c)$.

• The Markov transitions $M_{n+1}$ from $E_n$ into $E_{n+1}$ are extended to
transitions $M^c_{n+1}$ from $E_n^c$ into $E_{n+1}^c$ by setting $M^c_{n+1}(c, \cdot) = \delta_c$ and
for each $x_n \in E_n$, $M^c_{n+1}(x_n, dx_{n+1}) = M_{n+1}(x_n, dx_{n+1})$.

• Finally, the Markov extension $\mathcal{G}^c_n$ of $\mathcal{G}_n$ on $E_n \cup \{c\}$ is given by

$$\mathcal{G}^c_n(x_n, dy_n) = G_n(x_n)\, \delta_{x_n}(dy_n) + (1 - G_n(x_n))\, \delta_c(dy_n) \tag{2.18}$$

Note that for any $x_{n-1} \in E_{n-1}$ and $A_n \in \mathcal{E}_n$ we have

$$\mathcal{G}^c_{n-1} M^c_n(x_{n-1}, A_n) = G_{n-1}(x_{n-1})\, M_n(x_{n-1}, A_n)$$

The corresponding Markov chain

$$\Big( \Omega^c = \prod_{n \geq 0} E_n^c, \quad \mathcal{F}^c = (\mathcal{F}^c_n)_{n \geq 0}, \quad X = (X_n)_{n \geq 0}, \quad \mathbb{P}^c_\mu \Big)$$

with initial distribution $\mu \in \mathcal{P}(E_0)$ and elementary transitions

$$Q^c_{n+1} = \mathcal{G}^c_n M^c_{n+1} \tag{2.19}$$

can be regarded as a Markov particle evolving in an environment with
absorbing obstacles related to potential functions $G_n$. In view of (2.19), we
see that the motion is decomposed into two separate killing/exploration
transitions:

$$X_n \xrightarrow{\ \text{killing}\ } \hat X_n \xrightarrow{\ \text{exploration}\ } X_{n+1}$$

This killing/exploration mechanism represents the overlapping of the two
elementary transitions $\mathcal{G}^c_n$ and $M^c_{n+1}$. They are defined as follows:







• Killing: If $X_n = c$, we set $\hat X_n = c$. Otherwise the particle $X_n$ is still
alive. In this case, with a probability $G_n(X_n)$ it remains in the same
site so that $\hat X_n = X_n$, and with a probability $1 - G_n(X_n)$ it is killed
and we set $\hat X_n = c$.

• Exploration: First, since there is probably no life after death, when
the particle has been killed we have $\hat X_n = c$ and we set $X_p = \hat X_p = c$
for any $p > n$. Otherwise the particle $\hat X_n \in E_n$ evolves to a new
location $X_{n+1}$ in $E_{n+1}$ randomly chosen according to the distribution
$M_{n+1}(\hat X_n, \cdot)$.

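In this physical interpretation, the Feynman-Kac distribution flows $\hat\eta_n$ and
$\eta_n$ represent the conditional distributions of a nonabsorbed Markov particle.
To see this claim, we denote by $T$ the time at which the particle has
been killed

$$T = \inf\{n \geq 0 \; ; \; \hat X_n = c\}$$

By construction, we have

$$\begin{aligned}
\mathbb{P}^c_\mu(T > n) &= \mathbb{P}^c_\mu(\hat X_0 \in E_0, \ldots, \hat X_n \in E_n) \\
&= \int_{E_0 \times \cdots \times E_n} \mu(dx_0)\, G_0(x_0)\, M_1(x_0, dx_1) \cdots M_n(x_{n-1}, dx_n)\, G_n(x_n) \\
&= \mathbb{E}_\mu\Big(\prod_{p=0}^{n} G_p(X_p)\Big)
\end{aligned}$$

This also shows that the normalizing constants of $\hat\eta_n$ and $\eta_n$ represent
respectively the probability for the particle to be killed at a time strictly
greater than or at least equal to $n$. In other words, we have that

$$\hat\gamma_n(1) = \mathbb{P}^c_\mu(T > n) \quad \text{and} \quad \gamma_n(1) = \mathbb{P}^c_\mu(T \geq n)$$

Similar arguments yield that

$$\hat\gamma_n(f_n) = \mathbb{E}^c_\mu\big(f_n(X_n)\, 1_{T > n}\big) \quad \text{and} \quad \gamma_n(f_n) = \mathbb{E}^c_\mu\big(f_n(X_n)\, 1_{T \geq n}\big)$$

where $\mathbb{E}^c_\mu(.)$ is the expectation with respect to $\mathbb{P}^c_\mu(.)$. From the observations
above, we conclude that

$$\hat\eta_n(f_n) = \mathbb{E}^c_\mu\big(f_n(X_n) \mid T > n\big) \quad \text{and} \quad \eta_n(f_n) = \mathbb{E}^c_\mu\big(f_n(X_n) \mid T \geq n\big)$$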
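The killing interpretation can be checked exactly on a finite state space by adjoining the cemetery point as one extra index. The sketch below is a minimal illustration under assumed data: a three-point space, a time-homogeneous pair, and the hypothetical helpers `killing_kernel` and `exploration_kernel`, which build the Markov extensions described in (2.18) and in the bullet list preceding it. It propagates the law of the killed chain and compares the survival probabilities and the conditional laws with $\gamma_n(1)$, $\hat\gamma_n(1)$ and $\eta_n$.

```python
# Exact check of the killing interpretation on E = {0, 1, 2} plus a cemetery index.

G = [0.2, 0.9, 0.5]                      # hypothetical potential with 0 < G <= 1
M = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.4, 0.4, 0.2]]                    # hypothetical Markov transition
mu0 = [0.3, 0.3, 0.4]                    # hypothetical initial distribution
d = len(G)                               # index d plays the role of the cemetery c

def apply_kernel(mu, K):
    return [sum(mu[i] * K[i][j] for i in range(len(mu))) for j in range(len(K[0]))]

def killing_kernel(G):
    """Stay put with probability G(i), jump to c otherwise; c is absorbing."""
    n = len(G)
    K = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        K[i][i] = G[i]
        K[i][n] = 1.0 - G[i]
    K[n][n] = 1.0
    return K

def exploration_kernel(M):
    """Move with M on E; a killed particle stays at c."""
    n = len(M)
    K = [row[:] + [0.0] for row in M]
    K.append([0.0] * n + [1.0])
    return K

Gc, Mc = killing_kernel(G), exploration_kernel(M)
law = mu0 + [0.0]                        # law of X_n on E with the cemetery added
gamma = mu0[:]                           # unnormalized prediction measure gamma_n
errors = []
for n in range(6):
    p_T_ge_n = sum(law[:d])              # P(T >= n) = P(X_n in E)
    gamma_mass = sum(gamma)              # gamma_n(1)
    eta = [g / gamma_mass for g in gamma]
    cond = [x / p_T_ge_n for x in law[:d]]          # Law(X_n | T >= n)
    after_kill = apply_kernel(law, Gc)   # law of hat X_n
    p_T_gt_n = sum(after_kill[:d])       # P(T > n)
    hat_gamma_mass = sum(g * w for g, w in zip(G, gamma))   # hat gamma_n(1)
    errors.append(abs(p_T_ge_n - gamma_mass))
    errors.append(abs(p_T_gt_n - hat_gamma_mass))
    errors.append(max(abs(a - b) for a, b in zip(cond, eta)))
    law = apply_kernel(after_kill, Mc)   # law of X_{n+1}
    gamma = [sum(gamma[i] * G[i] * M[i][j] for i in range(d)) for j in range(d)]

print("largest discrepancy:", max(errors))
```

Because the cemetery is absorbing, the mass remaining on the original states at time $n$ is exactly the unnormalized measure $\gamma_n$, so the comparison holds up to rounding rather than up to Monte Carlo error.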

To get one step further in our discussion, it is convenient to introduce some 
additional terminology. 

Definition 2.5.2 The subsets $G_n^{-1}((0, 1))$ and $G_n^{-1}(0)$ are called respectively
the sets of soft and hard obstacles (at time $n$).







By construction, a particle entering into a hard obstacle is instantly killed.
When it enters into a soft obstacle, its lifetime decreases.

Let $\hat E_n = E_n - G_n^{-1}(0)$, and suppose the accessibility condition $(\mathcal{A})$
introduced in (2.16) is met. Let $(\hat G_n, \hat M_n)$ be the restrictions to the state
spaces $\hat E_n$ of the pair potentials/kernels defined in Proposition 2.5.2. From
the discussion given in Section 2.4.3, the updated Feynman-Kac model
associated with the pair $(G_n, M_n)$ with initial distribution $\eta_0$ coincides
with the prediction model associated with the pair $(\hat G_n, \hat M_n)$ with initial
distribution $\hat\eta_0$. Furthermore, if we replace in the preceding construction
the mathematical objects $(\eta_0, E_n, G_n, M_n)$ by $(\hat\eta_0, \hat E_n, \hat G_n, \hat M_n)$, we define
a particle motion in an absorbing medium with no hard obstacles. Loosely
speaking, this strategy consists in replacing the hard obstacles by repulsive
obstacles. It is instructive to examine the situation where $G_n = 1_{\hat E_n}$. In this
case, the Feynman-Kac model associated with $(\eta_0, G_n, M_n)$ corresponds to
a particle motion in an absorbing medium with pure hard obstacle sets $E_n - \hat E_n$,
while the Feynman-Kac model associated with $(\hat\eta_0, \hat G_n, \hat M_n)$ corresponds to
a particle motion in an absorbing medium with only soft obstacles related
to the potential functions defined for any $x_n \in \hat E_n$ by

$$\hat G_n(x_n) = M_{n+1}(G_{n+1})(x_n) = \mathbb{P}_{\eta_0}(X_{n+1} \in \hat E_{n+1} \mid X_n = x_n)$$

Note that the smaller the chance of entering $\hat E_{n+1}$ from some region, the
more stringent the obstacle.

2.5.2 Interacting Process Interpretation

In interacting process literature, Feynman-Kac flows are alternatively seen
as a nonlinear measure-valued process. For instance, the distribution
sequence $\eta_n$ defined in (2.8) is regarded as a solution of nonlinear recursive
equations of the form

$$\eta_{n+1} = \eta_n K_{n+1,\eta_n} \tag{2.20}$$

with the initial distribution $\eta_0 = \mu \in \mathcal{P}(E_0)$ and a collection of Markov
kernels $K_{n+1,\eta}$ from $E_n$ into $E_{n+1}$. As mentioned in the introduction the
choice of $K_{n+1,\eta}$ is far from being unique. From (2.9) and (2.10), we easily
see that we can choose

$$K_{n+1,\eta} = S_{n,\eta} M_{n+1} \tag{2.21}$$

with the Markov kernels $S_{n,\eta}$ on $E_n$ defined by

$$S_{n,\eta}(x_n, dy_n) = G_n(x_n)\, \delta_{x_n}(dy_n) + (1 - G_n(x_n))\, \Psi_n(\eta)(dy_n) \tag{2.22}$$

Note that the evolution equation corresponding to this choice of kernels is
decomposed into two separate transitions

$$\eta_n \longrightarrow \hat\eta_n = \eta_n S_{n,\eta_n} \longrightarrow \eta_{n+1} = \hat\eta_n M_{n+1} \tag{2.23}$$







In contrast to the first killing interpretation given in Section 2.5.1, we
have here turned the sub-Markovian kernel $\mathcal{G}_n$ into the Markov case in a
nonlinear way by replacing the Dirac measure on the cemetery point $c$ by
the Gibbs-Boltzmann distribution $\Psi_n(\eta_n)$.

The nonlinear measure-valued process (2.23) can be interpreted as the
evolution of the laws of a nonhomogeneous Markov chain with a two-step
transition $S_{n,\eta_n} M_{n+1}$ that depends on the distribution $\eta_n$ of the current
value of the chain. The precise mathematical description of this Markovian
interpretation is simply based on the construction of a judicious probability
measure on the canonical space. These mathematical objects are currently
used in the literature on mean field interacting processes. For continuous
time models, the desired distribution on the canonical space cannot be
described explicitly. We usually need to resort to some "fixed point argument"
to ensure the existence and uniqueness of these measures. In the discrete
time situation, they can be described in a very simple and natural way in
terms of the transitions $K_{n,\eta}$. We give next an abstract formulation of these
probability measures, and we check that they satisfy all the requirements.

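Definition 2.5.3 The McKean measure associated with a collection of
Markov kernels $(K_{n+1,\eta})_{\eta \in \mathcal{P}(E_n),\, n \in \mathbb{N}}$ with initial distribution $\eta_0 \in \mathcal{P}(E_0)$ is
a probability measure $\mathbb{K}_{\eta_0}$ on the canonical space

$$\Big( \Omega = \prod_{n \geq 0} E_n, \quad \mathcal{F} = (\mathcal{F}_n)_{n \in \mathbb{N}}, \quad X = (X_n)_{n \in \mathbb{N}} \Big)$$

Its $n$-time marginals $\mathbb{K}_{\eta_0,n} = \mathbb{K}_{\eta_0} \circ (X_0, \ldots, X_n)^{-1}$ are given by

$$\mathbb{K}_{\eta_0,n}(d(x_0, \ldots, x_n)) = \eta_0(dx_0)\, K_{1,\eta_0}(x_0, dx_1) \cdots K_{n,\eta_{n-1}}(x_{n-1}, dx_n) \tag{2.24}$$

where $\eta_n \in \mathcal{P}(E_n)$ is the solution of the recursive equation

$$\eta_{n+1} = \eta_n K_{n+1,\eta_n}$$

with the initial distribution $\eta_0$.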

By construction, we have for any $n \in \mathbb{N}$ in a synthetic integral form

$$\mathbb{K}_{\eta_0}\big((X_0, \ldots, X_n) \in d(x_0, \ldots, x_n)\big) = \eta_0(dx_0)\, K_{1,\eta_0}(x_0, dx_1) \cdots K_{n,\eta_{n-1}}(x_{n-1}, dx_n)$$

This clearly implies that under $\mathbb{K}_{\eta_0}$ the canonical Markov chain $X_n$ has
elementary transitions $K_{n,\eta_{n-1}}$ and initial distribution $\eta_0$. We denote by
$\mathbb{E}_{\eta_0}(.)$ the expectation with respect to $\mathbb{K}_{\eta_0}(.)$. To prove that $\eta_n$ is the law







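of $X_n$ under $\mathbb{K}_{\eta_0}$, we simply check that for any test function $f_n \in \mathcal{B}_b(E_n)$

$$\begin{aligned}
\mathbb{E}_{\eta_0}(f_n(X_n)) &= \int_{E_0 \times \cdots \times E_n} f_n(x_n)\, \eta_0(dx_0)\, K_{1,\eta_0}(x_0, dx_1) \cdots K_{n,\eta_{n-1}}(x_{n-1}, dx_n) \\
&= \eta_{n-1} K_{n,\eta_{n-1}}(f_n) = \eta_n(f_n)
\end{aligned}$$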



Nonlinear measure-valued equations are usually not attached to a partic- 
ular McKean measure. The choice of the latter depends on the physical 
interpretation of the model. To distinguish these possibly different choices 
of models, we will adopt the following terminology. 



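Definition 2.5.4 Suppose $\eta_n \in \mathcal{P}(E_n)$ is a sequence of distributions
satisfying a recursive equation $\eta_{n+1} = \Phi_{n+1}(\eta_n)$ for some measurable mappings
$\Phi_{n+1} : \mathcal{P}(E_n) \to \mathcal{P}(E_{n+1})$, $n \in \mathbb{N}$. A collection of Markov kernels $K_{n+1,\eta}$,
$\eta \in \mathcal{P}(E_n)$, $n \in \mathbb{N}$, satisfying the compatibility condition

$$\eta K_{n+1,\eta} = \Phi_{n+1}(\eta)$$

for any $\eta \in \mathcal{P}(E_n)$ and $n \in \mathbb{N}$ is called the McKean interpretation of the
flow $\eta_n$.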


In comparison with (2.19), under $\mathbb{K}_{\eta_0}$ the motion of the canonical model
$X_n \to X_{n+1}$ is the overlapping of an interacting jump and an exploration
transition

$$X_n \xrightarrow{\ \text{interacting jump}\ } \hat X_n \xrightarrow{\ \text{exploration}\ } X_{n+1} \tag{2.25}$$

These mechanisms are defined as follows:

• Interacting jump: Given the position and the distribution $\eta_n$
at time $n$ of the particle $X_n$, it performs a jump to a new site $\hat X_n$
randomly chosen according to the distribution

$$S_{n,\eta_n}(X_n, .) = G_n(X_n)\, \delta_{X_n} + (1 - G_n(X_n))\, \Psi_n(\eta_n)$$

In other words, with a probability $G_n(X_n)$ the particle remains in the
same site, and we set $\hat X_n = X_n$. Otherwise it jumps to a new location
randomly chosen according to the Boltzmann-Gibbs distribution

$$\Psi_n(\eta_n)(dx_n) = \frac{1}{\eta_n(G_n)}\, G_n(x_n)\, \eta_n(dx_n)$$

Notice that during this transition the particle is attracted by regions
with high potential in accordance with the updating transformation
of the model.

• Exploration: The exploration transition coincides with that of the
killed particle model. During this stage, the particle $\hat X_n$ evolves to a
new site $X_{n+1}$ randomly chosen according to $M_{n+1}(\hat X_n, .)$.
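The two-step interacting jump and exploration transition can be written as explicit matrices on a finite state space. The sketch below is a hedged illustration with assumed data (three states, a time-homogeneous pair, hypothetical helper names). It builds the selection kernel of (2.22), composes it with $M$ as in (2.21), and checks at each step that $\eta_n K_{n+1,\eta_n}$ agrees with $\Psi_n(\eta_n) M_{n+1}$, that is, that the kernels are compatible with the Feynman-Kac flow.

```python
# McKean kernel K = S M for the interacting jump interpretation (finite state space).

G = [0.2, 0.9, 0.5]                      # hypothetical potential; 0 < G <= 1 is required
M = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3],
     [0.4, 0.4, 0.2]]                    # hypothetical exploration transition
eta0 = [0.3, 0.3, 0.4]                   # hypothetical initial distribution

def apply_kernel(mu, K):
    return [sum(mu[i] * K[i][j] for i in range(len(mu))) for j in range(len(K[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def psi(eta, G):
    """Boltzmann-Gibbs transformation Psi(eta)(i) = G(i) eta(i) / eta(G)."""
    eta_G = sum(e * g for e, g in zip(eta, G))
    return [g * e / eta_G for g, e in zip(G, eta)]

def selection_kernel(eta, G):
    """S(i, j) = G(i) 1{i = j} + (1 - G(i)) Psi(eta)(j), as in (2.22)."""
    target = psi(eta, G)
    d = len(G)
    return [[(G[i] if i == j else 0.0) + (1.0 - G[i]) * target[j] for j in range(d)]
            for i in range(d)]

eta = eta0
gaps, row_sums = [], []
for n in range(6):
    K = matmul(selection_kernel(eta, G), M)          # K_{n+1, eta_n}
    via_mckean = apply_kernel(eta, K)                # eta_n K_{n+1, eta_n}
    via_phi = apply_kernel(psi(eta, G), M)           # Psi(eta_n) M
    gaps.append(max(abs(a - b) for a, b in zip(via_mckean, via_phi)))
    row_sums.extend(sum(row) for row in K)
    eta = via_mckean

print("largest discrepancy:", max(gaps))
```

Note that the kernel has to be rebuilt at every step because it depends on the current law $\eta_n$; this dependence is what makes the chain nonlinear in the sense of the text.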







2.5.3 McKean Models

In this section, we discuss in some detail the nonuniqueness of McKean
interpretations of Feynman-Kac models. In Section 2.5.2, we have already
seen that different choices of Markov kernels $K_{n,\eta}$ satisfying the compatibility
condition $\eta K_{n+1,\eta} = \Phi_{n+1}(\eta)$ $(= \Psi_n(\eta) M_{n+1})$ correspond to different
McKean interpretations. One natural strategy to construct compatible kernels
is to find a collection of selection transitions $S_{n,\eta}$ on $E_n$ such that

$$\eta S_{n,\eta} = \Psi_n(\eta)$$

If we set $K_{n+1,\eta} = S_{n,\eta} M_{n+1}$, then we clearly obtain the desired compatible
transitions. Our immediate objective is to compare the McKean models
associated with the two choices

1) $S_{n,\eta}(x_n, .) = \Psi_n(\eta)$

2) $S_{n,\eta}(x_n, .) = G_n(x_n)\, \delta_{x_n} + (1 - G_n(x_n))\, \Psi_n(\eta)$

In the first case, the selection transition $S_{n,\eta}(x_n, dy_n)$ does not depend
on the current location $x_n$. As a result, the particle selects a new location
more often, even if its current location fits the potential function well. In this sense, this
model contains more randomness than the second one. Also notice that the
corresponding McKean measure is now a tensor product measure

$$\mathbb{K}_{\eta_0} = \otimes_{n \geq 0}\, \eta_n$$

Under this new reference measure, the canonical process is again an
interacting jump model of the form (2.25). The mutation remains the same, but
the jump transition is a little simpler. Here the particle $X_n$ selects
randomly a new site $\hat X_n$ with the Boltzmann-Gibbs distribution $\Psi_n(\eta_n)$
associated with the law $\eta_n$ of the current state.

Next we examine the somewhat degenerate situation where the potential
functions are constant, $G_n = 1$. In this case, the Feynman-Kac flow (2.20)
represents the distributions of the random states $X_n$ of the chain with
Markov transitions $M_n$ and we have $\eta_{n+1} = \eta_n M_{n+1}$. In the second model,
the jump transition disappears and we have

$$S_{n,\eta}(x_n, dy_n) = \delta_{x_n}(dy_n) \quad \text{and} \quad K_{n+1,\eta} = M_{n+1}$$

The corresponding McKean measure is simply the distribution of the chain
with Markov transitions $M_{n+1}$

$$\mathbb{K}_{\eta_0,n}(d(x_0, \ldots, x_n)) = \eta_0(dx_0)\, M_1(x_0, dx_1) \cdots M_n(x_{n-1}, dx_n)$$

On the other hand, in the first case model we have $S_{n,\eta_n}(x_n, .) = \eta_n$ and
$K_{n+1,\eta_n}(x_n, .) = \eta_{n+1}$. The McKean measure is now the tensor product of
the distribution laws of the chain with transitions $M_n$

$$\mathbb{K}_{\eta_0,n}(d(x_0, \ldots, x_n)) = \eta_0(dx_0)\, \eta_1(dx_1) \cdots \eta_n(dx_n)$$






As we shall see in the further development of Chapter 3, these two models
will have a similar interacting particle interpretation. We already mentioned
that the particle interpretation of the first model coincides with the
traditional selection/mutation genetic algorithm. The particle model associated
with the second McKean interpretation is numerically "more stable". We
will make this assertion precise in Chapter 9 with a comparison of the
variances in the central limit theorems associated with these two models. This
discussion seems to indicate that it is preferable to use the second McKean
interpretation, but this is not always possible. To clarify this comment,
we recall that we have made the noninnocent assumption that $G_n$ takes
values in $(0, 1]$. This condition is crucial to define the second model but
it is not essential in the first interpretation. Indeed, the Boltzmann-Gibbs
distributions $\Psi_n(\eta_n)$ are well-defined for any strictly positive functions $G_n$.
Of course this situation can be embedded in the first one by replacing $G_n$
by $G_n / \|G_n\|$. But if we do so, then running the corresponding particle
algorithm we shall need to compute at each time the supremum norms $\|G_n\|$
of the current potential function (at least on the current configuration).

Next we relax condition $G_n \leq 1$ but still suppose that $G_n$ is strictly
positive. Since the potential functions are bounded, we can always find a
nonnegative number $\epsilon_n \geq 0$ such that $\epsilon_n G_n \leq 1$. Arguing as usual, we
check that for any $\eta \in \mathcal{P}(E_n)$ we have $\Psi_n(\eta) = \eta S_{n,\eta}$ with the Markov
transition $S_{n,\eta}$ from $E_n$ into itself defined by

$$S_{n,\eta}(x_n, .) = \epsilon_n G_n(x_n)\, \delta_{x_n} + (1 - \epsilon_n G_n(x_n))\, \Psi_n(\eta) \tag{2.26}$$

It is interesting to note that this formulation contains the two cases
examined above. The case 1) corresponds to the situation $\epsilon_n = 0$ and, whenever
$G_n \leq 1$, case 2) corresponds to the choice $\epsilon_n = 1$. We also note that we
can choose a parameter $\epsilon_n = \epsilon_n(\eta) \geq 0$ that depends on the current
distribution $\eta$. For instance, we can choose

$$1 / \epsilon_n(\eta) = \eta\text{-ess sup}\, G_n$$

so that (2.26) reads

$$S_{n,\eta}(x_n, .) = \frac{G_n(x_n)}{\eta\text{-ess sup}\, G_n}\, \delta_{x_n} + \Big(1 - \frac{G_n(x_n)}{\eta\text{-ess sup}\, G_n}\Big)\, \Psi_n(\eta) \tag{2.27}$$
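The one-parameter family (2.26) is easy to test numerically. The following sketch uses assumed data: a potential that exceeds 1, a distribution with full support, and the hypothetical helper `selection_kernel_eps`. It checks for several values of $\epsilon$, including $\epsilon = 0$ and the choice in (2.27), that each kernel is a genuine Markov transition and satisfies $\eta S_{n,\eta} = \Psi_n(\eta)$.

```python
# Selection kernels S = eps G delta + (1 - eps G) Psi(eta) for several eps, as in (2.26).

G = [0.5, 2.0, 4.0]                      # hypothetical positive potential, not bounded by 1
eta = [0.2, 0.5, 0.3]                    # hypothetical distribution with full support

def psi(eta, G):
    eta_G = sum(e * g for e, g in zip(eta, G))
    return [g * e / eta_G for g, e in zip(G, eta)]

def selection_kernel_eps(eta, G, eps):
    """S(i, j) = eps G(i) 1{i = j} + (1 - eps G(i)) Psi(eta)(j); needs eps * G <= 1."""
    target = psi(eta, G)
    d = len(G)
    return [[(eps * G[i] if i == j else 0.0) + (1.0 - eps * G[i]) * target[j]
             for j in range(d)] for i in range(d)]

# eta-essential supremum of G: the largest value of G on the support of eta.
ess_sup = max(g for g, w in zip(G, eta) if w > 0)

report = {}
for eps in (0.0, 0.1, 1.0 / ess_sup):
    S = selection_kernel_eps(eta, G, eps)
    is_markov = all(min(row) >= -1e-15 and abs(sum(row) - 1.0) < 1e-12 for row in S)
    eta_S = [sum(eta[i] * S[i][j] for i in range(len(G))) for j in range(len(G))]
    gap = max(abs(a - b) for a, b in zip(eta_S, psi(eta, G)))
    report[eps] = (is_markov, gap)

for eps, (is_markov, gap) in report.items():
    print(f"eps={eps:.3f}  Markov={is_markov}  gap={gap:.2e}")
```

Larger values of $\epsilon$ put more mass on the diagonal, so the particle keeps its current site more often; $\epsilon = 0$ recovers case 1), where every particle is resampled from $\Psi_n(\eta)$.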

We end this section with two alternative McKean interpretations. In the 
first one, we use the decomposition Gn = G;5; + G~ with 

Gn = Gn lGn>i and G“ = Gn lGn<l 



For a given distribution t] on En^ three situations may occur: 

1. If Tj(Gn > 1) = 0, then Gn = G” < 1 r/-almost surely. In particular, 
we have rj(G”) > 0, and the Boltzmann-Gibbs distribution '^niv) 







En associated with the potential function is well-defined. More- 
over we have ^n{v) ~ ^niv) = vSn,ri with the Markov transition 

^n,ij(®n) •) = (®n) <^i„ + (1 ~ G'n (®n)) '®’n iv) 

2. If T]{Gn < 1) = 0, then G„ = G+ > 1 rj-almost surely. In particular, 
we have rj{G^) > 1, and the Boltzmann-Gibbs distribution ( 7 ;) on 
E„ associated with the potentid function G^ is well-defined. Further- 
more, we prove easily that 9n{fj} — ^tiv) = with the Markov 
transition 






3. Finally, if rf{Gn < 1) A r/(Gn > 1) > 0, we can use for instance the 
decomposition 



»n(l) 



l(Gj) 

7/(Gn + Gn ) 






y{Gn) 

V{Gn +Gn) 



iv) “ V^n.rj 



with the Markov transition defined by 



Sn,ri 



riiOt) 0 + . 1 ( 0 ;:) c- 

fi(ai + a^) 7,(0* + g;) 



When the potential $G_n = \exp V_n$ is related to some (bounded) energy
function $V_n$, we can use the decomposition

$$G_n = G_n^+\, G_n^- \quad \text{with} \quad G_n^+ = e^{V_n^+} \geq 1 \quad \text{and} \quad G_n^- = e^{-V_n^-} \leq 1$$

where $V_n^+$ and $V_n^-$ are the positive and negative parts of $V_n$. By the
multiplicative form of the decomposition above, we clearly have the formulae

$$\Psi_n = \Psi_n^+ \circ \Psi_n^- = \Psi_n^- \circ \Psi_n^+$$

where $\Psi_n^+$ and $\Psi_n^-$ denote the Boltzmann-Gibbs transformations associated
with the potential functions $G_n^+$ and $G_n^-$. From these observations, we can
decompose the mapping $\eta \to \Psi_n(\eta)$ in two different ways. We can use for
instance the decomposition



St., 



^ Vn = VSt 



n,T) 






= ^niv) 



V 







2.5.4 Kalman-Bucy filters

The measure-valued equations presented in Proposition 2.5.2 can rarely be
solved analytically and recursively in time, except in some particular
situations. When the state spaces are finite, with a reasonably small cardinality,
the integral operators reduce to finite sums, and the solution reduces to
simple algebraic computations. Another generic situation where an explicit
solution exists is known in filtering literature as the linear/Gaussian
filtering problem. Rather than rederiving rigorously the optimal Kalman-Bucy
equations from the start, we provide in this section a short and informal
way to obtain these explicit solutions. For a more detailed discussion of
linear filtering problems, the reader is referred to the pioneering articles of
R.E. Kalman and R.S. Bucy [196, 197]. A rigorous derivation of extended
Kalman-Bucy solutions, and more recent developments, can be found in
the textbook by A.N. Shiryaev [287].

We consider a $\mathbb{R}^{p+q}$-valued Markov chain $(X_n, Y_n)$ defined by the
recursive relations

$$\begin{cases} X_n = A_n X_{n-1} + a_n + B_n W_n, & n \geq 1 \\ Y_n = C_n X_n + c_n + D_n V_n, & n \geq 0 \end{cases} \tag{2.28}$$

for some $\mathbb{R}^{d_w}$ and $\mathbb{R}^{d_v}$-valued independent random sequences $W_n$ and $V_n$,
independent of $X_0$, some matrices $A_n$, $B_n$, $C_n$, $D_n$ with appropriate
dimensions and finally some $(p + q)$-dimensional vector $(a_n, c_n)$. We further
assume that $W_n$ and $V_n$ are centered Gaussian random sequences with covariance
matrices $R^w_n$, $R^v_n$ and $X_0$ is a Gaussian random variable in $\mathbb{R}^p$ with a mean
and covariance matrix denoted by

$$\hat X_0^- = \mathbb{E}(X_0) \quad \text{and} \quad \hat P_0^- = \mathbb{E}\big((X_0 - \mathbb{E}(X_0))\, (X_0 - \mathbb{E}(X_0))'\big)$$

In the further development of this section we shall denote by $\mathcal{N}(m, R)$ a
Gaussian distribution on a $d$-dimensional space $\mathbb{R}^d$ with mean vector $m \in \mathbb{R}^d$
and covariance matrix $R \in \mathbb{R}^{d \times d}$

$$\mathcal{N}(m, R)(dx) = \frac{1}{(2\pi)^{d/2} \sqrt{|R|}}\, \exp\big[-2^{-1} (x - m) R^{-1} (x - m)'\big]\, dx$$

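We also fix a sequence of observations $Y = y$ and we introduce the
nonhomogeneous potential/transitions $(G_n, M_n)$ defined as follows:

• We let $G_n : \mathbb{R}^p \to (0, \infty)$ be defined by the Radon-Nikodym
derivative

$$G_n(x_n) = \frac{d\mathcal{N}(C_n x_n + c_n,\, D_n R^v_n D_n')}{d\mathcal{N}(0,\, D_n R^v_n D_n')}(y_n)$$

• We let $M_{n+1}$ be the Gaussian transition on $\mathbb{R}^p$ defined by

$$M_{n+1}(x_n, dx_{n+1}) = \mathcal{N}(A_{n+1} x_n + a_{n+1},\, B_{n+1} R^w_{n+1} B_{n+1}')(dx_{n+1})$$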







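To have a well-defined pair of potentials/kernels $(G_n, M_n)$, we have
implicitly assumed that the covariance matrices are nondegenerate. The
distribution flow defined for any $f \in \mathcal{B}_b(\mathbb{R}^p)$ by the Feynman-Kac formulae

$$\eta_n(f) = \gamma_n(f) / \gamma_n(1) \quad \text{with} \quad \gamma_n(f) = \mathbb{E}\Big(f(X_n) \prod_{p=0}^{n-1} G_p(X_p)\Big) \tag{2.29}$$

and their updated versions $\hat\eta_n$ represent, respectively, the one-step
predictors and the optimal filters; that is, we have that

$$\begin{aligned}
\eta_n &= \mathrm{Law}\big(X_n \mid Y_{[0,n-1]} = (y_0, \ldots, y_{n-1})\big) \\
\hat\eta_n &= \mathrm{Law}\big(X_n \mid Y_{[0,n]} = (y_0, \ldots, y_n)\big)
\end{aligned}$$

with $Y_{[0,n]} = (Y_p)_{0 \leq p \leq n}$. Under our assumptions, $\eta_n$ and $\hat\eta_n$ are Gaussian
distributions

$$\eta_n = \mathcal{N}(\hat X_n^-, \hat P_n^-) \quad \text{and} \quad \hat\eta_n = \mathcal{N}(\hat X_n, \hat P_n)$$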

The synthesis of the conditional mean and covariance matrices is carried
out using the traditional Kalman-Bucy recursive equations. A short and
slightly abusive way to derive these recursions is as follows. To find the
prediction step we simply observe that

$$\begin{aligned}
\hat X_{n+1}^- &= \mathbb{E}\big(A_{n+1} X_n + a_{n+1} + B_{n+1} W_{n+1} \mid Y_{[0,n]} = (y_0, \ldots, y_n)\big) \\
&= A_{n+1} \hat X_n + a_{n+1}
\end{aligned}$$

and

$$\begin{aligned}
\hat P_{n+1}^- &= \mathbb{E}\big((A_{n+1}(X_n - \hat X_n) + B_{n+1} W_{n+1})\, (A_{n+1}(X_n - \hat X_n) + B_{n+1} W_{n+1})'\big) \\
&= A_{n+1} \hat P_n A_{n+1}' + B_{n+1} R^w_{n+1} B_{n+1}'
\end{aligned}$$

In summary, we have proved the

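Lemma 2.5.1 For any $(m, P) \in (\mathbb{R}^p \times \mathbb{R}^{p \times p})$ and $n \geq 1$ the linear
prediction step is given by the Markov transport equation

$$\mathcal{N}(m, P)\, M_n = \mathcal{N}(m_n, P_n)$$

with the mean vector $m_n \in \mathbb{R}^p$ and covariance matrix $P_n \in \mathbb{R}^{p \times p}$

$$\begin{aligned}
m_n &= A_n m + a_n \\
P_n &= A_n P A_n' + B_n R^w_n B_n'
\end{aligned}$$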






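The updating step is partly based on the fact that the martingale
difference $(\hat X_n - \hat X_n^-)$ (with respect to the observation filtration) has the
representation property with respect to the
innovation process; that is, we have

$$\hat X_n - \hat X_n^- = \mathbf{G}_n\, (Y_n - \hat Y_n^-)$$

for some gain matrix $\mathbf{G}_n$, and where

$$\hat Y_n^- = \mathbb{E}(Y_n \mid Y_{[0,n-1]}) = C_n \hat X_n^- + c_n$$

Since we have $\mathbb{E}\big((X_n - \hat X_n)(Y_n - \hat Y_n^-)'\big) = 0$, and

$$(Y_n - \hat Y_n^-) = C_n (X_n - \hat X_n^-) + D_n V_n$$

we find that

$$\mathbb{E}\big((X_n - \hat X_n^-)(Y_n - \hat Y_n^-)'\big) = \mathbf{G}_n\, \mathbb{E}\big((Y_n - \hat Y_n^-)(Y_n - \hat Y_n^-)'\big)$$

We conclude that $\mathbf{G}_n = \hat P_n^- C_n' (C_n \hat P_n^- C_n' + D_n R^v_n D_n')^{-1}$. Finally, using the
decomposition $X_n - \hat X_n = (X_n - \hat X_n^-) + (\hat X_n^- - \hat X_n)$ and by a symmetry
argument, we conclude that

$$\begin{aligned}
\hat P_n &= \hat P_n^- - \mathbb{E}\big((\hat X_n^- - \hat X_n)(\hat X_n^- - \hat X_n)'\big) \\
&= \hat P_n^- - \mathbf{G}_n\, \mathbb{E}\big((Y_n - \hat Y_n^-)(Y_n - \hat Y_n^-)'\big)\, \mathbf{G}_n' = \hat P_n^- - \mathbf{G}_n C_n \hat P_n^-
\end{aligned}$$

In summary, we have proved the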

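Lemma 2.5.2 For any $(m, P) \in (\mathbb{R}^p \times \mathbb{R}^{p \times p})$ and $n \geq 0$ the linear
updating step is given by the equation

$$\Psi_n(\mathcal{N}(m, P)) = \mathcal{N}(m_n, P_n)$$

with the mean vector $m_n \in \mathbb{R}^p$ and covariance matrix $P_n \in \mathbb{R}^{p \times p}$

$$\begin{aligned}
m_n &= m + \mathbf{G}_n (y_n - (C_n m + c_n)) \\
P_n &= (I - \mathbf{G}_n C_n) P
\end{aligned}$$

with the filter gain matrix $\mathbf{G}_n = P C_n' (C_n P C_n' + D_n R^v_n D_n')^{-1}$.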
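The two lemmas translate directly into a predict/update pair. The sketch below is restricted to the scalar case $p = q = 1$ so that it needs no linear algebra library; this restriction, the function names `predict` and `update`, and all numerical values are assumptions made for illustration. In the matrix case the products become matrix products and the division becomes a matrix inverse.

```python
# Scalar Kalman-Bucy recursions following Lemma 2.5.1 (prediction) and Lemma 2.5.2 (updating).

def predict(m, P, A, a, B, Rw):
    """N(m, P) M_n = N(A m + a, A P A' + B Rw B'), written for scalars."""
    return A * m + a, A * P * A + B * Rw * B

def update(m, P, y, C, c, D, Rv):
    """Psi_n(N(m, P)) with gain P C' (C P C' + D Rv D')^{-1}, written for scalars."""
    gain = P * C / (C * P * C + D * Rv * D)
    return m + gain * (y - (C * m + c)), (1.0 - gain * C) * P

# One updating step followed by one prediction step, with hypothetical values.
m_pred, P_pred = 0.0, 1.0                                    # predictor N(0, 1)
m_filt, P_filt = update(m_pred, P_pred, y=2.0, C=1.0, c=0.0, D=1.0, Rv=1.0)
m_next, P_next = predict(m_filt, P_filt, A=1.0, a=0.0, B=1.0, Rw=1.0)

# With C = 0 the observation carries no information and the update is the identity.
m_same, P_same = update(0.3, 2.0, y=5.0, C=0.0, c=1.0, D=1.0, Rv=1.0)

print(m_filt, P_filt, m_next, P_next)
```

With these values the gain equals 1/2, so the filter mean moves halfway from the predicted mean to the observation and the variance is halved before the prediction step adds the signal noise back.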

Finally observe that whenever $A_n = 0 = C_n$ are null matrices then the
potential functions $G_n$ are constant and the Feynman-Kac flow reduces to

$$\eta_n = \hat\eta_n = \mathrm{Law}(X_n) = \mathcal{N}(a_n,\, B_n R^w_n B_n')$$

2.6 Feynman-Kac Models in Random Media 

In this section, we discuss a modeling technique to represent Feynman-Kac
formulae in random media. This new level of randomness has different
interpretations. In physics, the randomness usually appears in the description






of a given absorbing medium. This rather traditional point of view consists 
in considering random potential functions. In some other instances, such as 
in filtering problems, the randomness rather comes from a realization of an 
auxiliary process that influences the evolution of a reference signal. In this 
situation, the potential functions and the Markov motions are both ran- 
dom. One way of treating these two cases is to consider a nonhomogeneous 
Markov chain with two components 



where (En\si*^), t = 1, 2, is a pair of measurable spaces. The first and sec- 
ond components represent respectively the random variation of the medium 
and the “reference” particle motion. We further require that its Markov 
transitions from En-i into En have the form 

•^n((®n-l) 1/n— l)) 1/n)) ~ ■^n ^(®n— li^n) 

where and are Markov transitions from Ej^^i into E„^ with 

t = 1, 2. Finally, we assume that the distribution of Xq is given by 

rfo(d(xo,yo)) = rii^\dxo) Vxloid’Vo) 

with G V{En^) and G V{En^). By construction the medium is 
changing randomly at each time but is not influenced by the evolution of 
the particle. In the reverse angle, the particle transitions depend on the 
current value of the medium.

Before entering into the precise definition of the Feynman-Kac models 
associated with this pair Markov chain, it is convenient to fix some notation 
and to describe with some precision the law of the quenched Markov particle 
motions. By construction, the $n$th time marginals $P_{\eta_0,n}$ of the law
$P_{\eta_0}$ associated with the canonical Markov chain $X_n$ are given by

$$P_{\eta_0,n}(d((x_0, y_0), \ldots, (x_n, y_n))) = \eta_0(d(x_0, y_0))\, M_1((x_0, y_0), d(x_1, y_1)) \cdots M_n((x_{n-1}, y_{n-1}), d(x_n, y_n))$$

By direct inspection, we see that $X^1$ is a Markov chain with transitions
$M_n^{(1)}$ and initial distribution $\eta_0^{(1)}$. In other words, the distribution
$P^{(1)}_{\eta_0^{(1)},n}$ of the path $(X_0^1, \ldots, X_n^1)$ from 0 up to time $n$ of the
first component is given by

$$P^{(1)}_{\eta_0^{(1)},n}(d(x_0, \ldots, x_n)) = \eta_0^{(1)}(dx_0)\, M_1^{(1)}(x_0, dx_1) \cdots M_n^{(1)}(x_{n-1}, dx_n)$$

We conclude that for any realization of the medium

$$x = (x_n)_{n \ge 0} \in \prod_{n \ge 0} E_n^{(1)}$$






we have the synthetic integral formula

$$P_{\eta_0,n}(d((x_0, y_0), \ldots, (x_n, y_n))) = P^{(1)}_{\eta_0^{(1)},n}(d(x_0, \ldots, x_n))\, P^{(2)}_{[x],n}(d(y_0, \ldots, y_n))$$

with

$$P^{(2)}_{[x],n}(d(y_0, \ldots, y_n)) = \eta^{(2)}_{x_0,0}(dy_0)\, M^{(2)}_{x_1,1}(y_0, dy_1) \cdots M^{(2)}_{x_n,n}(y_{n-1}, dy_n)$$

In other words, given $X^1 = x$, the random sequence $X^2$ is a Markov chain
with transitions $M^{(2)}_{x_n,n}$ and initial distribution $\eta^{(2)}_{x_0,0}$.

In the further development of this section, we denote by $E_{\eta_0}(.)$ the
expectation with respect to the law $P_{\eta_0}$ of the Markov chain $X_n$. We also
simplify notation and write $E_{[x]}(.)$ instead of $E^{(2)}_{[x]}(.)$ for the
expectation with respect to the conditional distribution of the Markov chain
$X^2$ given a realization $X^1 = x$ of the medium.



2.6.1 Quenched and Annealed Feynman-Kac Flows 

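Let $G_n : E_n = (E_n^{(1)} \times E_n^{(2)}) \to (0, \infty)$ be a given collection of
bounded measurable functions. We notice that the Feynman-Kac path measures
(2.7) associated with the pair $(G_n, M_n)$ with initial distribution $\eta_0$ are
defined by

$$Q_{\eta_0,n}(d((x_0, y_0), \ldots, (x_n, y_n))) = \frac{1}{Z_n} \left\{ \prod_{p=0}^{n-1} G_p(x_p, y_p) \right\} P^{(1)}_{\eta_0^{(1)},n}(d(x_0, \ldots, x_n))\, P^{(2)}_{[x],n}(d(y_0, \ldots, y_n))$$

with the normalizing constant $Z_n = E_{\eta_0}\left(\prod_{p=0}^{n-1} G_p(X_p^1, X_p^2)\right) > 0$.

Definition 2.6.1 • The quenched Feynman-Kac path measures associated with
a realization $X^1 = x$ are defined by the formulae

$$Q_{[x],n}(d(y_0, \ldots, y_n)) = \frac{1}{Z_{[x],n}} \left\{ \prod_{p=0}^{n-1} G_p(x_p, y_p) \right\} P^{(2)}_{[x],n}(d(y_0, \ldots, y_n))$$

with normalizing constants $Z_{[x],n} = E_{[x]}\left(\prod_{p=0}^{n-1} G_p(x_p, X_p^2)\right) > 0$.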

• The annealed path measures are defined by the synthetic integral for- 
mula 

Q7o,n(<i(y0,---jyn)) 







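Note that the quenched quantities $(Q_{[x],n}, Z_{[x],n}, P^{(2)}_{[x],n})$ only depend
on the path $(x_0, \ldots, x_n)$ from the origin up to time $n$. To clarify the
presentation, we denote by $G_{x_p,p}(.)$ the "random" potential functions
defined by

$$G_{x_n,n} : y_n \in E_n^{(2)} \mapsto G_{x_n,n}(y_n) = G_n(x_n, y_n) \in (0, \infty)$$

Definition 2.6.2 Given a realization $X^1 = x$, the quenched Feynman-Kac
distributions on $E_n^{(2)}$ are defined for any $f_n \in B_b(E_n^{(2)})$ by

$$\eta_{[x],n}(f_n) = \frac{\gamma_{[x],n}(f_n)}{\gamma_{[x],n}(1)} \quad \text{with} \quad \gamma_{[x],n}(f_n) = E_{[x]}\left( f_n(X_n^2) \prod_{p=0}^{n-1} G_{x_p,p}(X_p^2) \right) \qquad (2.30)$$

The annealed Feynman-Kac distributions on $E_n^{(2)}$ are defined for any
$f_n \in B_b(E_n^{(2)})$ by

$$\eta_n^{(2)}(f_n) = \frac{\gamma_n^{(2)}(f_n)}{\gamma_n^{(2)}(1)} \quad \text{with} \quad \gamma_n^{(2)}(f_n) = E_{\eta_0}\left( f_n(X_n^2) \prod_{p=0}^{n-1} G_p(X_p) \right) \qquad (2.31)$$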

Next we have collected two important properties related to the normal- 
ized and unnormalized quenched measures. These properties are simple 
consequences of results presented in Section 2.3 and Section 2.5.2. Their 
complete proofs are left to the reader. Using Proposition 2.3.1, we prove 
the following result. 

Proposition 2.6.1 For any $n \ge 1$, $f_n \in B_b(E_n^{(2)})$ and, given $X^1 = x$, we
have the representation formula

$$\gamma_{[x],n}(f_n) = \eta_{[x],n}(f_n) \times \prod_{p=0}^{n-1} \left[ \int_{E_p^{(2)}} G_p(x_p, y)\, \eta_{[x],p}(dy) \right]$$

The dynamical structure of the quenched Feynman-Kac model is defined in
terms of a pair of random updating/prediction mechanisms. More precisely, 
using the same line of argument as the one given in Section 2.5.2, we prove 
the following result. 

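Proposition 2.6.2 The quenched distribution flow $\eta_{[x],n}$ satisfies the
nonlinear equation

$$\eta_{[x],n} = \Phi_n^{(2)}((x_{n-1}, x_n), \eta_{[x],n-1}) \qquad (2.32)$$

The one-step mappings

$$\Phi_n^{(2)} : (E_{n-1}^{(1)} \times E_n^{(1)}) \times \mathcal{P}(E_{n-1}^{(2)}) \to \mathcal{P}(E_n^{(2)}), \qquad ((u, v), \eta) \mapsto \Phi_n^{(2)}((u, v), \eta) = \Psi_{n-1,u}(\eta)\, M^{(2)}_{v,n}$$

are defined in terms of the mappings $\Psi_{n,u} : \mathcal{P}(E_n^{(2)}) \to \mathcal{P}(E_n^{(2)})$ given by

$$\Psi_{n,u}(\eta)(dy_n) = \frac{1}{\eta(G_{u,n})}\, G_{u,n}(y_n)\, \eta(dy_n)$$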
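As an illustration of the quenched recursion (2.32) and of the product formula of Proposition 2.6.1, the following sketch works on a small finite model. The state spaces, the kernels `M2`, the initial laws `ETA0`, and the potential `G` are hypothetical choices made only for this example (time-homogeneous for brevity); they do not come from the text.

```python
from itertools import product

# Hypothetical toy model: medium values u in {0, 1}, particle states y in {0, 1, 2}.
M2 = {  # M2[v][y][y2]: particle kernel when the medium sits at v
    0: [[0.7, 0.3, 0.0], [0.2, 0.6, 0.2], [0.0, 0.3, 0.7]],
    1: [[0.4, 0.4, 0.2], [0.3, 0.4, 0.3], [0.2, 0.4, 0.4]],
}
ETA0 = {0: [0.5, 0.5, 0.0], 1: [0.2, 0.3, 0.5]}  # initial law of the particle given x_0


def G(u, y):
    """Strictly positive potential G(u, y) (assumed, time-homogeneous)."""
    return 1.0 + 0.5 * u + 0.25 * y


def psi(u, eta):
    """Boltzmann-Gibbs updating: Psi_u(eta)(dy) = G(u, y) eta(dy) / eta(G(u, .))."""
    w = [G(u, y) * eta[y] for y in range(3)]
    z = sum(w)
    return [v / z for v in w]


def phi(u, v, eta):
    """One-step mapping Phi((u, v), eta) = Psi_u(eta) M2_v, as in (2.32)."""
    upd = psi(u, eta)
    return [sum(upd[y] * M2[v][y][y2] for y in range(3)) for y2 in range(3)]


def quenched_flow(x):
    """Return [eta_{[x],0}, ..., eta_{[x],n}] for a medium path x = (x_0, ..., x_n)."""
    flow = [list(ETA0[x[0]])]
    for n in range(1, len(x)):
        flow.append(phi(x[n - 1], x[n], flow[-1]))
    return flow


def gamma_bruteforce(x):
    """Unnormalized quenched measure, computed by enumerating all particle paths."""
    n = len(x) - 1
    gamma = [0.0, 0.0, 0.0]
    for ys in product(range(3), repeat=n + 1):
        w = ETA0[x[0]][ys[0]]
        for p in range(n):
            w *= G(x[p], ys[p]) * M2[x[p + 1]][ys[p]][ys[p + 1]]
        gamma[ys[-1]] += w
    return gamma


x = (0, 1, 1, 0, 1)  # one fixed realization of the medium
flow = quenched_flow(x)
gamma = gamma_bruteforce(x)

# Product formula: total mass = prod_{p<n} eta_{[x],p}(G(x_p, .)).
mass = 1.0
for p in range(len(x) - 1):
    mass *= sum(G(x[p], y) * flow[p][y] for y in range(3))
```

Under these assumptions, `mass * flow[-1]` agrees with the brute-force `gamma` up to rounding, which is the content of Proposition 2.6.1 for this toy medium path.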



In some instances, the quenched distribution flow $\eta_{[x],n}$ has a nice
explicit description. For instance, given the first component $X^1$ of a pair
signal $(X^1, X^2)$, the signal/observation pair $(X^2, Y)$ may be a traditional
linear/Gaussian model. In this case, the nonlinear quenched equations are
solved by the traditional Kalman-Bucy recursive formulae (see Section 2.5.4
and Section 12.6.7). We recommend that the interested reader derive the
quenched Kalman-Bucy solution of the filtering problem defined as in (2.28),
replacing $(a_n, c_n)$ by $(a_n(X_n^1), c_n(X_n^1))$, and $(A_n, C_n, D_n)$
by some functions of $X_n^1$.



2.6.2 Feynman-Kac Models in Distribution Space 

Our next objective is to introduce a Feynman-Kac model in distribution
space that connects the quenched and annealed measures. To this end, we
introduce the stochastic sequence

$$X'_n = (X_n^1, \eta_{[X^1],n}) \in E'_n = E_n^{(1)} \times \mathcal{P}(E_n^{(2)}) \qquad (2.33)$$

Using the recursive equations (2.32), we prove the following result.

Proposition 2.6.3 The stochastic sequence $X'_n$ is a Markov chain with
transitions $M'_n$ defined for any $f'_n \in B_b(E'_n)$ by

$$M'_n(f'_n)(u, \eta) = \int \ldots$$

and initial distribution $\eta'_0 \in \mathcal{P}(E'_0) = \mathcal{P}(E_0^{(1)} \times \mathcal{P}(E_0^{(2)}))$ defined by

$$\eta'_0(d(x, \nu)) = \eta_0^{(1)}(dx)\, \delta_{\eta^{(2)}_{x,0}}(d\nu)$$

Proof: 

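For any $f'_n \in B_b(E'_n)$, we have

where $F_n$ stands for the $\sigma$-algebra generated by the random variables $X'_p$,
$p \le n$. Recalling that $X'_{n-1} = (X^1_{n-1}, \eta_{[X^1],n-1})$, we conclude that

$$E_{\eta_0}(f'_n(X'_n) \mid F_{n-1}) = E_{\eta_0}(f'_n(X'_n) \mid X'_{n-1})$$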







This clearly ends the proof of the proposition. ■ 

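Similar arguments can be used to check that $X'_n$ is a Markov chain with
respect to the distribution $P^{(1)}_{\eta_0^{(1)}}$. We associate with the Markov chain
$X'_n$ the distribution flow $\eta'_n$ on $E'_n$ defined for any $f'_n \in B_b(E'_n)$ by the
Feynman-Kac formulae

$$\eta'_n(f'_n) = \gamma'_n(f'_n)/\gamma'_n(1) \quad \text{with} \quad \gamma'_n(f'_n) = E_{\eta_0}\left( f'_n(X'_n) \prod_{p=0}^{n-1} G'_p(X'_p) \right) \qquad (2.34)$$

with the "annealed" potential functions

$$G'_n : (x, \mu) \in E'_n \mapsto G'_n(x, \mu) = \int \mu(dy)\, G_n(x, y) \in (0, \infty) \qquad (2.35)$$

Arguing as above, we find that the flow $\eta'_n$ satisfies the recursive equation

$$\eta'_n = \Phi'_n(\eta'_{n-1}) \qquad (2.36)$$

with the one-step mappings

$$\Phi'_n : \mathcal{P}(E'_{n-1}) \to \mathcal{P}(E'_n), \qquad \eta \mapsto \Phi'_n(\eta) = \Psi'_{n-1}(\eta)\, M'_n$$

The updating transitions $\Psi'_n : \mathcal{P}(E'_n) \to \mathcal{P}(E'_n)$ are now given by

$$\Psi'_n(\eta)(f'_n) = \eta(G'_n f'_n)/\eta(G'_n)$$

In the next proposition, we show that $\eta'_n$ contains all the information on
the annealed distributions.

Proposition 2.6.4 For any functions $f'_n \in B_b(E'_n)$, $f_n \in B_b(E_n^{(2)})$, and
$h_n \in B_b(E_n^{(1)})$, we have

$$f'_n(x, \eta) = \eta(f_n) \implies \gamma'_n(f'_n) = E_{\eta_0}\left(\gamma_{[X^1],n}(f_n)\right)$$

$$f'_n(x, \eta) = h_n(x) \implies \gamma'_n(f'_n) = \gamma_n^{(1)}(h_n) \quad \text{and} \quad \eta'_n(f'_n) = \eta_n^{(1)}(h_n)$$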

Proof: 

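The proof of the first assertion is based on the fact that, for any
$f'_n \in B_b(E'_n)$ of the form $f'_n(x, \eta) = \eta(f_n)$ with $f_n \in B_b(E_n^{(2)})$, we have

$$\gamma'_n(f'_n) = E_{\eta_0}\left( f'_n(X'_n) \prod_{p=0}^{n-1} G'_p(X'_p) \right) = E_{\eta_0}\left( \eta_{[X^1],n}(f_n) \prod_{p=0}^{n-1} \eta_{[X^1],p}(G_{X_p^1,p}) \right)$$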










Using Proposition 2.6.1, we find that 
and 

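This ends the proof of the first assertion. The second implication is proved
similarly by noting that, for any $f'_n \in B_b(E'_n)$ of the form
$f'_n(x, \eta) = h_n(x)$ with $h_n \in B_b(E_n^{(1)})$, we have

$$\gamma'_n(f'_n) = E_{\eta_0}\left( h_n(X_n^1)\, E_{[X^1]}\left( \prod_{p=0}^{n-1} G_p(X_p^1, X_p^2) \right) \right) \quad \text{(by Prop. 2.6.1)}$$

$$= E_{\eta_0}\left( h_n(X_n^1) \prod_{p=0}^{n-1} G_p(X_p) \right)$$

The end of the proof is now straightforward. ■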






2.7 Feynman-Kac Semigroups 

This section focuses on the semigroup structure of discrete time Feynman- 
Kac models. We describe these structural properties in an abstract nonho- 
mogeneous framework. We first recall some notation relative to the Markov 
chain and the potential functions with which the models are built. 

We let $(E_n, \mathcal{E}_n)$, $n \in \mathbb{N}$, be a collection of measurable spaces. We consider
an arbitrary probability measure $\eta_0$ on $E_0$ and a collection of Markov
transitions $M_{n+1}(x_n, dx_{n+1})$ from $E_n$ into $E_{n+1}$. We associate with these
objects a nonhomogeneous Markov chain $(\Omega, F, (X_n)_{n \in \mathbb{N}}, P_{\eta_0})$ taking values
at each time $n$ on $E_n$ with initial distribution $\eta_0$ and elementary transitions
$M_{n+1}$ from $E_n$ into $E_{n+1}$.

For any $p \in \mathbb{N}$ and $x_p \in E_p$, we denote by $P_{p,x_p}$ the probability distribution
of the shifted chain $(X_{p+n})_{n \ge 0}$ starting at $x_p$ and we use the notation
$E_{p,x_p}(.)$ for the expectation with respect to this law. In this notation, we
have for instance for any bounded measurable function $f_{p,n}$ on $E_{[p+1,p+n]}$

$$E_{p,x_p}(f_{p,n}(X_{p+1}, \ldots, X_{p+n})) = \int f_{p,n}(x_{p+1}, \ldots, x_{p+n})\, M_{p+1}(x_p, dx_{p+1}) \cdots M_{p+n}(x_{p+n-1}, dx_{p+n})$$

When $p = 0$, we sometimes slightly abuse the notation and write $P_{x_0}$ and
$E_{x_0}$ instead of $P_{0,x_0}$ and $E_{0,x_0}$. Given a distribution $\mu_p \in \mathcal{P}(E_p)$, we use
the notation $E_{p,\mu_p}(.)$ for the expectation with respect to the measure

$$P_{p,\mu_p}(.) = \int_{E_p} \mu_p(dx_p)\, P_{p,x_p}(.)$$

In the further development of this section, we denote respectively by $x_n$,
$f_n$, and $\mu_n$ a point in $E_n$, a bounded and measurable function on $E_n$, and
a probability measure on $E_n$.

Definition 2.7.1 We denote by $M_{p,n}$, $0 \le p \le n$, the linear semigroup
associated with the Markov kernels $M_n$ and defined by

$$M_{p,n} = M_{p+1} M_{p+2} \cdots M_n$$

We use the convention $M_{n,n} = Id$ for $p = n$. This semigroup is alternatively
defined by

$$M_{p,n}(f_n)(x_p) = E_{p,x_p}(f_n(X_n))$$

Let $G_n : E_n \to (0, \infty)$, $n \ge 0$, be a collection of bounded potential
functions. The Feynman-Kac prediction model $\eta_n \in \mathcal{P}(E_n)$ associated with
$(G_n, M_n)$ is defined by the formulae

$$\eta_n(f_n) = \gamma_n(f_n)/\gamma_n(1) \quad \text{with} \quad \gamma_n(f_n) = E_{\eta_0}\left( f_n(X_n) \prod_{p=0}^{n-1} G_p(X_p) \right)$$

We also recall that the updated models are given by

$$\widehat{\eta}_n(f_n) = \widehat{\gamma}_n(f_n)/\widehat{\gamma}_n(1) \quad \text{with} \quad \widehat{\gamma}_n(f_n) = \gamma_n(f_n G_n)$$


2.7.1 Prediction Semigroups 

The study of the dynamical structure of $\gamma_n$ and $\eta_n$ was initiated in
Section 2.5. We recall that $\gamma_n$ satisfies the recursive equation

$$\gamma_{n+1} = \gamma_n Q_{n+1} \quad \text{with} \quad Q_{n+1}(f_{n+1}) = G_n\, M_{n+1}(f_{n+1}) \quad \text{and} \quad \gamma_0 = \eta_0$$

Definition 2.7.2 We denote by $Q_{p,n}$, $0 \le p \le n$, the linear semigroup
associated with $\gamma_n$ and defined by

$$Q_{p,n} = Q_{p+1} Q_{p+2} \cdots Q_n$$

We use the convention $Q_{n,n} = Id$ for $p = n$. This semigroup is alternatively
defined by the Feynman-Kac formulae

$$Q_{p,n}(f_n)(x_p) = E_{p,x_p}\left( f_n(X_n) \prod_{q=p}^{n-1} G_q(X_q) \right)$$
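The two descriptions of the unnormalized flow (the recursion $\gamma_{n+1} = \gamma_n Q_{n+1}$ and the Feynman-Kac path formula) can be compared numerically on a finite state space. The sketch below is only an illustration: the three-point state space, the kernel `M`, the potential `G`, and the initial law `eta0` are hypothetical, and the model is taken time-homogeneous for brevity.

```python
from itertools import product

# Hypothetical time-homogeneous model on the state space {0, 1, 2}.
M = [[0.5, 0.5, 0.0],
     [0.1, 0.6, 0.3],
     [0.0, 0.4, 0.6]]
G = [1.0, 0.5, 2.0]
eta0 = [0.2, 0.3, 0.5]


def apply_Q(gamma):
    """One step gamma -> gamma Q, where Q(x, dy) = G(x) M(x, dy)."""
    k = len(gamma)
    return [sum(gamma[x] * G[x] * M[x][y] for x in range(k)) for y in range(k)]


def gamma_recursive(n):
    """gamma_n obtained from gamma_0 = eta_0 by n applications of Q."""
    gamma = list(eta0)
    for _ in range(n):
        gamma = apply_Q(gamma)
    return gamma


def gamma_bruteforce(n):
    """gamma_n(f) = E(f(X_n) prod_{p<n} G(X_p)), summing over all paths."""
    k = len(eta0)
    gamma = [0.0] * k
    for path in product(range(k), repeat=n + 1):
        w = eta0[path[0]]
        for p in range(n):
            w *= G[path[p]] * M[path[p]][path[p + 1]]
        gamma[path[-1]] += w
    return gamma


def normalize(gamma):
    """eta_n = gamma_n / gamma_n(1)."""
    total = sum(gamma)
    return [g / total for g in gamma]


def mass_product(n):
    """prod_{p<n} eta_p(G_p), which should equal gamma_n(1)."""
    gamma, prod = list(eta0), 1.0
    for _ in range(n):
        eta = normalize(gamma)
        prod *= sum(e * g for e, g in zip(eta, G))
        gamma = apply_Q(gamma)
    return prod
```

For instance, `gamma_recursive(4)` and `gamma_bruteforce(4)` coincide up to rounding, and `mass_product(4)` recovers the total mass `sum(gamma_recursive(4))`.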







By the definition of $\eta_n$ and $Q_{p,n}$ we readily observe that

$$\eta_n(f_n) = \frac{\gamma_n(f_n)}{\gamma_n(1)} = \frac{\gamma_p(Q_{p,n}(f_n))}{\gamma_p(Q_{p,n}(1))} = \frac{\eta_p(Q_{p,n}(f_n))}{\eta_p(Q_{p,n}(1))}$$

This representation leads to the following definition.

Definition 2.7.3 We denote by $\Phi_{p,n}$, $0 \le p \le n$, the nonlinear semigroup
associated with $\eta_n$ and defined by

$$\Phi_{p,n} = \Phi_n \circ \Phi_{n-1} \circ \cdots \circ \Phi_{p+1}$$

We use the convention $\Phi_{n,n} = Id$ for $p = n$. This semigroup is alternatively
defined by the Feynman-Kac formulae

$$\Phi_{p,n}(\mu_p)(f_n) = \frac{\mu_p(Q_{p,n}(f_n))}{\mu_p(Q_{p,n}(1))} = \frac{E_{p,\mu_p}\left( f_n(X_n) \prod_{q=p}^{n-1} G_q(X_q) \right)}{E_{p,\mu_p}\left( \prod_{q=p}^{n-1} G_q(X_q) \right)}$$

The preceding models will be of constant use in the next chapters. In the
second part of this section, we enter more deeply into the dynamical
structure of these models. The forthcoming analysis will be used in Chapter 4
when we study the stability properties of Feynman-Kac semigroups. To
describe the fine structure of $\Phi_{p,n}$ it is convenient to introduce the
following objects.

Definition 2.7.4 For any $0 \le p \le n$, we denote respectively by
$G_{p,n} : E_p \to (0, \infty)$ and $P_{p,n}$ the potential functions on $E_p$ and the Markov
kernels from $E_p$ into $E_n$ defined by

$$G_{p,n} = Q_{p,n}(1) \quad \text{and} \quad P_{p,n}(f_n) = Q_{p,n}(f_n)/Q_{p,n}(1)$$

The next proposition expresses the fact that all the mappings $\Phi_{p,n}$ have
the same updating/prediction nature.

Proposition 2.7.1 The mappings $\Phi_{p,n} : \mathcal{P}(E_p) \to \mathcal{P}(E_n)$, $0 \le p \le n$,
satisfy the formula

$$\Phi_{p,n}(\mu_p) = \Psi_{p,n}(\mu_p)\, P_{p,n} \qquad (2.37)$$

with the Boltzmann-Gibbs transformation $\Psi_{p,n}$ from $\mathcal{P}(E_p)$ into itself
defined by

$$\Psi_{p,n}(\mu_p)(f_p) = \mu_p(G_{p,n} f_p)/\mu_p(G_{p,n})$$

In addition, the pairs $(G_{p,n}, P_{p,n})_{p \le n}$ satisfy the backward recursive
equations

$$G_{p,n} = G_p \times M_{p+1}(G_{p+1,n}) \quad \text{and} \quad P_{p,n} = R^{(n)}_{p+1} P_{p+1,n}$$

with the Markov kernels $R_p^{(n)}$ from $E_{p-1}$ into $E_p$, $1 \le p \le n$, defined by

$$R_p^{(n)}(f_p) = M_p(G_{p,n} f_p)/M_p(G_{p,n})$$







Proof: 

The proof of the first assertion is a simple consequence of the definition of
$G_{p,n}$, $P_{p,n}$, and $\Psi_{p,n}$. To prove the inductive formulae, we simply note that

$$G_{p,n} = Q_{p,n}(1) = Q_{p,p+1}(Q_{p+1,n}(1)) = G_p\, M_{p+1}(G_{p+1,n})$$

To prove the second one, we observe that

$$P_{p,n}(f_n) = \frac{Q_{p,p+1} Q_{p+1,n}(f_n)}{Q_{p,p+1} Q_{p+1,n}(1)} = \frac{M_{p+1}[Q_{p+1,n}(f_n)]}{M_{p+1}[Q_{p+1,n}(1)]} = \frac{M_{p+1}[G_{p+1,n}\, P_{p+1,n}(f_n)]}{M_{p+1}[G_{p+1,n}]}$$

The end of the proof is now straightforward. ■ 
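As a numerical illustration of Proposition 2.7.1, the sketch below builds $G_{p,n}$ and $P_{p,n}$ from the backward recursions and compares $\Psi_{p,n}(\mu_p) P_{p,n}$ with the iterated one-step mappings. The finite model (`M`, `G`, `mu`) is a hypothetical, time-homogeneous example and not part of the text.

```python
K = 3  # size of the hypothetical finite state space {0, 1, 2}
M = [[0.5, 0.5, 0.0],
     [0.1, 0.6, 0.3],
     [0.0, 0.4, 0.6]]
G = [1.0, 0.5, 2.0]


def kernel_on_function(A, f):
    """(A f)(x) = sum_y A(x, y) f(y)."""
    return [sum(A[x][y] * f[y] for y in range(K)) for x in range(K)]


def measure_on_kernel(mu, A):
    """(mu A)(y) = sum_x mu(x) A(x, y)."""
    return [sum(mu[x] * A[x][y] for x in range(K)) for y in range(K)]


def compose(A, B):
    """Kernel composition (A B)(x, y) = sum_z A(x, z) B(z, y)."""
    return [[sum(A[x][z] * B[z][y] for z in range(K)) for y in range(K)]
            for x in range(K)]


def boltzmann_gibbs(mu, g):
    """Psi(mu)(dx) = g(x) mu(dx) / mu(g)."""
    w = [g[x] * mu[x] for x in range(K)]
    z = sum(w)
    return [v / z for v in w]


def backward(p, n):
    """Return (G_{p,n}, P_{p,n}) computed with the backward recursions."""
    G_pn = [1.0] * K                                    # G_{n,n} = 1
    P_pn = [[1.0 if x == y else 0.0 for y in range(K)]  # P_{n,n} = Id
            for x in range(K)]
    for _ in range(n - p):
        MG = kernel_on_function(M, G_pn)
        # R(x, dy) = M(x, dy) G_{q+1,n}(y) / M(G_{q+1,n})(x)
        R = [[M[x][y] * G_pn[y] / MG[x] for y in range(K)] for x in range(K)]
        P_pn = compose(R, P_pn)
        G_pn = [G[x] * MG[x] for x in range(K)]
    return G_pn, P_pn


def phi_iterated(mu, p, n):
    """Phi_{p,n}(mu) as n - p successive updating/prediction steps."""
    for _ in range(n - p):
        mu = measure_on_kernel(boltzmann_gibbs(mu, G), M)
    return mu


def phi_closed_form(mu, p, n):
    """Phi_{p,n}(mu) = Psi_{p,n}(mu) P_{p,n}, as in (2.37)."""
    G_pn, P_pn = backward(p, n)
    return measure_on_kernel(boltzmann_gibbs(mu, G_pn), P_pn)


mu = [0.2, 0.3, 0.5]
```

With these choices, `phi_iterated(mu, 1, 5)` and `phi_closed_form(mu, 1, 5)` agree up to rounding, and each row of the kernel returned by `backward` sums to one.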

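One important consequence of the proposition above is that $P_{p,n}$ can be
regarded as the transition from $E_p$ into $E_n$ of a nonhomogeneous Markov
chain with elementary transitions $R_q^{(n)}$, $p < q \le n$. More precisely, using
Proposition 2.7.1, we find that

$$P_{p,n} = R^{(n)}_{p+1} R^{(n)}_{p+2} \cdots R^{(n)}_n$$

Definition 2.7.5 For any $n \in \mathbb{N}$, we denote by $(R^{(n)}_{p,q})_{0 \le p \le q \le n}$ the linear
semigroup associated with the Markov kernels $(R_p^{(n)})_{0 < p \le n}$, and defined by

$$R^{(n)}_{p,q} = R^{(n)}_{p+1} R^{(n)}_{p+2} \cdots R^{(n)}_q$$

We use the convention $R^{(n)}_{p,p} = Id$ for $p = q$.

In this notation, we note that $R^{(n)}_{n-1,n} = M_n$ and $R^{(n)}_{p,n} = P_{p,n}$. We also quote
the following technical lemma.

Lemma 2.7.1 For any $0 \le p \le q \le n$, we have

$$P_{p,n} = R^{(n)}_{p,q} P_{q,n} \quad \text{and} \quad R^{(n)}_{p,q}(f_q) = Q_{p,q}(G_{q,n} f_q)/Q_{p,q}(G_{q,n})$$

This semigroup is alternatively defined by the Feynman-Kac formulae

$$R^{(n)}_{p,q}(f_q)(x_p) = \frac{E_{p,x_p}\left( f_q(X_q) \prod_{k=p}^{n-1} G_k(X_k) \right)}{E_{p,x_p}\left( \prod_{k=p}^{n-1} G_k(X_k) \right)}$$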

Proof: 

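By the definition of $(P_{p,n}, G_{p,n})$, we have

$$P_{p,n}(f_n) = \frac{Q_{p,n}(f_n)}{Q_{p,n}(1)} = \frac{Q_{p,q} Q_{q,n}(f_n)}{Q_{p,q} Q_{q,n}(1)} = \frac{Q_{p,q}(G_{q,n}\, P_{q,n}(f_n))}{Q_{p,q}(G_{q,n})}$$

We conclude that $P_{p,n} = R^{(n)}_{p,q} P_{q,n}$ with

$$R^{(n)}_{p,q}(f_q) = Q_{p,q}(G_{q,n} f_q)/Q_{p,q}(G_{q,n})$$

To check the Feynman-Kac formulation of the semigroup, we observe that

$$Q_{p,q}(G_{q,n} f_q)(x_p) = Q_{p,q}(Q_{q,n}(1)\, f_q)(x_p)$$

$$= E_{p,x_p}\left( Q_{q,n}(1)(X_q)\, f_q(X_q) \prod_{k=p}^{q-1} G_k(X_k) \right)$$

$$= E_{p,x_p}\left( E_{q,X_q}\left( \prod_{l=q}^{n-1} G_l(X_l) \right) f_q(X_q) \prod_{k=p}^{q-1} G_k(X_k) \right)$$

Using the Markov property, we conclude that

$$Q_{p,q}(G_{q,n} f_q)(x_p) = E_{p,x_p}\left( f_q(X_q) \prod_{k=p}^{n-1} G_k(X_k) \right)$$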



The proof of the lemma is now completed. ■ 

Next we introduce another semigroup that appears to be useful in the 
analysis of the fluctuations of particle models. 

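Definition 2.7.6 We denote by $\overline{Q}_{p,n}$, $0 \le p \le n$, the semigroup associated
with the "normalized" integral operators $\overline{Q}_{p+1} = Q_{p+1}/\eta_p(G_p)$ and defined by

$$\overline{Q}_{p,n} = \overline{Q}_{p+1} \overline{Q}_{p+2} \cdots \overline{Q}_n$$

We use the convention $\overline{Q}_{n,n} = Id$ for $p = n$. This semigroup is alternatively
defined by the formulae

$$\overline{Q}_{p,n}(f_n) = \frac{\gamma_p(1)}{\gamma_n(1)}\, Q_{p,n}(f_n) = \frac{Q_{p,n}(f_n)}{\eta_p(Q_{p,n}(1))}$$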

2.7.2 Updated Semigroups 

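In this final section, we give a brief discussion on the semigroup structure of
the updated models $\widehat{\gamma}_n$ and $\widehat{\eta}_n$. Arguing as in Section 2.5, we immediately
notice that $\widehat{\gamma}_n$ satisfies the recursive equation

$$\widehat{\gamma}_{n+1} = \widehat{\gamma}_n \widehat{Q}_{n+1} \quad \text{with} \quad \widehat{Q}_{n+1}(f_{n+1}) = \widehat{G}_n\, \widehat{M}_{n+1}(f_{n+1})$$

and $\widehat{\gamma}_0(dx_0) = G_0(x_0)\, \eta_0(dx_0)$, with the pair potentials/kernels $(\widehat{G}_n, \widehat{M}_n)$
defined in Proposition 2.5.2. From the discussion given in Section 2.4.3,
the representations above show that the updated flow models associated
with the pair $(G_n, M_n)$ can be regarded as the prediction flows associated
with the pair $(\widehat{G}_n, \widehat{M}_n)$. We conclude that the semigroup structure of the
updated models $\widehat{\gamma}_n$ and $\widehat{\eta}_n$ is defined as the one given above by replacing
the pair $(G_n, M_n)$ by $(\widehat{G}_n, \widehat{M}_n)$. We use the superscript $\widehat{(.)}$ to denote the
corresponding objects. In this notation, the semigroups of $\widehat{\gamma}_n$ and $\widehat{\eta}_n$ are
respectively defined by

$$\widehat{Q}_{p,n} = \widehat{Q}_{p+1} \widehat{Q}_{p+2} \cdots \widehat{Q}_n \quad \text{and} \quad \widehat{\Phi}_{p,n} = \widehat{\Phi}_n \circ \widehat{\Phi}_{n-1} \circ \cdots \circ \widehat{\Phi}_{p+1}$$

We use the conventions $\widehat{Q}_{n,n} = Id = \widehat{\Phi}_{n,n}$ for $p = n$. We also have

$$\widehat{\Phi}_{p,n}(\mu_p)(f_n) = \frac{\mu_p(\widehat{Q}_{p,n}(f_n))}{\mu_p(\widehat{Q}_{p,n}(1))} = \frac{\mu_p(\widehat{G}_{p,n}\, \widehat{P}_{p,n}(f_n))}{\mu_p(\widehat{G}_{p,n})}$$

with the backward formulae

$$\widehat{G}_{p,n} = \widehat{G}_p\, \widehat{M}_{p+1}(\widehat{G}_{p+1,n}) \quad \text{and} \quad \widehat{P}_{p,n} = \widehat{R}^{(n)}_{p+1} \widehat{P}_{p+1,n}$$

where $\widehat{R}_p^{(n)}$ is the Markov kernel from $\widehat{E}_{p-1}$ into $\widehat{E}_p$, $p \ge 1$, defined by

$$\widehat{R}_p^{(n)}(f_p) = \widehat{M}_p(\widehat{G}_{p,n} f_p)/\widehat{M}_p(\widehat{G}_{p,n})$$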



These semigroups also appear to be useful in the analysis of Feynman-Kac
models associated with not necessarily strictly positive potentials. Let
$\widehat{E}_n = G_n^{-1}(0, \infty)$ be the support of the potential function $G_n$, and assume
that the accessibility condition $(\mathcal{A})$ introduced on page 67 is met.
Rephrasing the results given in Section 2.4.3, the main advantage of this
condition is that we can restrict the whole semigroup analysis to the state
spaces $(\widehat{E}_n, \widehat{\mathcal{E}}_n)$. We recall that in this case $\widehat{M}_{n+1}$ is a well-defined Markov
kernel from $\widehat{E}_n$ into $\widehat{E}_{n+1}$ and $\widehat{G}_n$ are now strictly positive potentials on
$\widehat{E}_n$. Another important comment in practice is that the potential functions
$G_n$ are usually not homogeneous with respect to the time parameter, and
the subsets $\widehat{E}_n$ may vary. This shows that we do need to consider
nonhomogeneous state-space models.

From the discussions above, it is also clear that the analysis of updated
models is reduced to the analysis of prediction ones by a suitable choice of
potentials and Markov kernels.

Nevertheless, in some situations it is more judicious to work directly with 
the particular Feynman-Kac structure of the updated semigroups. We can 
use alternatively one of the representations 



$$\widehat{Q}_{p,n}(f_n)(x_p) = E_{p,x_p}\left( f_n(X_n) \prod_{k=p+1}^{n} G_k(X_k) \right) = \widehat{E}_{p,x_p}\left( f_n(X_n) \prod_{k=p}^{n-1} \widehat{G}_k(X_k) \right)$$

where $E_{p,x_p}(.)$ (resp. $\widehat{E}_{p,x_p}(.)$) represents the expectation with respect to
the law $P_{p,x_p}$ (resp. $\widehat{P}_{p,x_p}$) of the shifted Markov chain $(X_n)_{n \ge p}$
starting at $x_p$ at time $n = p$ with elementary transitions $M_n$ (resp. $\widehat{M}_n$).




3 

Genealogical and Interacting Particle 
Models 



3.1 Introduction 

This chapter is devoted to particle interpretations of Feynman-Kac models.
From the discussion given in the introduction, these particle models
can be sought in many different ways, depending on the application we 
have in mind. For this reason, we have chosen to describe these models 
as an abstract stochastic linearization technique for solving nonlinear and 
measure-valued equations. The second part of this chapter is concerned 
with genealogical tree-based particle interpretations of Feynman-Kac mod- 
els in path space. 

In the further development of this chapter and unless otherwise stated, we
will assume that the potential functions are bounded and strictly positive.
To simplify the presentation, we only discuss the particle model associated
with a McKean interpretation of the prediction flow. We recall that the
Feynman-Kac prediction model $\eta_n \in \mathcal{P}(E_n)$ is defined for any $f_n \in B_b(E_n)$
by the equation

$$\eta_n(f_n) = \gamma_n(f_n)/\gamma_n(1) \quad \text{with} \quad \gamma_n(f_n) = E_{\eta_0}\left( f_n(X_n) \prod_{p=0}^{n-1} G_p(X_p) \right)$$

The process $X_n$ is a nonhomogeneous and $E_n$-valued Markov chain with
Markov transitions $M_n$ from $E_{n-1}$ into $E_n$, and $G_n$ is a given collection of
(strictly positive) $\mathcal{E}_n$-measurable potential functions on $E_n$. The situation
where $G_n$ may take null values can be reduced to this situation, under
appropriate accessibility conditions, by replacing $\eta_n$ by the updated model $\widehat{\eta}_n$






(see (2.16), Section 2.5 and Proposition 2.4.3, page 67). For the convenience
of the reader, we recall that $\eta_n$ satisfies a nonlinear recursive equation

$$\eta_{n+1} = \eta_n K_{n+1,\eta_n}$$

where $K_{n+1,\eta}$ is a nonunique collection of Markov kernels from $E_n$ into
$E_{n+1}$ satisfying the compatibility condition

$$\eta K_{n+1,\eta} = \Psi_n(\eta) M_{n+1} \quad \text{with} \quad \Psi_n(\eta)(dx) = \frac{1}{\eta(G_n)}\, G_n(x)\, \eta(dx)$$

For a more detailed description as well as several worked-out examples of
kernels $K_{n,\eta}$, we refer the reader to Section 2.5.2 and Section 2.5.3. In this
chapter, we design an abstract strategy to associate with a given collection
of compatible kernels $K_{n,\eta}$ a sequence of interacting particle approximation
models. We will not describe the whole class of models associated with
all the McKean interpretations discussed in Section 2.5.3; that would of
course be too much digression. We leave the interested reader to derive
the corresponding particle interpretations. In this chapter, we have chosen
to concentrate the discussion on the somewhat generic situation where the
kernels $K_{n,\eta}$ are a combination of a selection and a mutation transition

$$K_{n+1,\eta} = S_{n,\eta} M_{n+1} \qquad (3.1)$$

The selection transition $S_{n,\eta}$ on $E_n$ is given by

$$S_{n,\eta}(x_n, .) = \varepsilon_n G_n(x_n)\, \delta_{x_n} + (1 - \varepsilon_n G_n(x_n))\, \Psi_n(\eta)$$

where $\varepsilon_n \ge 0$ stands for any nonnegative number such that $\varepsilon_n G_n \le 1$.
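The compatibility of the selection transition with the Boltzmann-Gibbs transformation, namely $\eta S_{n,\eta} = \Psi_n(\eta)$ for every admissible $\varepsilon_n$, can be checked directly on a finite state space. In the sketch below, the potential `G` and the measure `eta` are hypothetical values chosen for illustration.

```python
G = [1.0, 0.5, 2.0]    # hypothetical strictly positive potential on {0, 1, 2}
eta = [0.2, 0.3, 0.5]  # hypothetical probability measure on {0, 1, 2}


def psi(eta):
    """Boltzmann-Gibbs transformation Psi(eta)(dx) = G(x) eta(dx) / eta(G)."""
    w = [g * e for g, e in zip(G, eta)]
    z = sum(w)
    return [v / z for v in w]


def selection_kernel(eta, eps):
    """S(x, .) = eps G(x) delta_x + (1 - eps G(x)) Psi(eta), with eps * max(G) <= 1."""
    target = psi(eta)
    k = len(eta)
    return [[eps * G[x] * (1.0 if x == y else 0.0)
             + (1.0 - eps * G[x]) * target[y]
             for y in range(k)]
            for x in range(k)]


def push_forward(eta, S):
    """(eta S)(y) = sum_x eta(x) S(x, y)."""
    k = len(eta)
    return [sum(eta[x] * S[x][y] for x in range(k)) for y in range(k)]
```

For example, `push_forward(eta, selection_kernel(eta, eps))` returns `psi(eta)` up to rounding for `eps` in `(0.0, 0.25, 0.5)`; the largest admissible value here is `1 / max(G) = 0.5`.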



3.2 Interacting Particle Interpretations 

In this section, we associate with a given collection of Markov transitions a
sequence of $N$-interacting particle systems. We will examine in more detail
the evolution of the particle model associated with the McKean
interpretation (3.1) and we will also compare the two cases $\varepsilon_n = 0$ and
$\varepsilon_n > 0$. At the end of this section, we discuss the situation where $G_n$ is not
necessarily strictly positive.

Definition 3.2.1 The interacting particle model associated with a collection
of Markov transitions $K_{n,\eta}$, $\eta \in \mathcal{P}(E_n)$, $n \ge 1$, and with an initial
distribution $\eta_0 \in \mathcal{P}(E_0)$ is a sequence of nonhomogeneous Markov chains







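taking values at each time $n \in \mathbb{N}$ in the product space $E_n^N$, that is

$$\xi_n^{(N)} = (\xi_n^{(N,1)}, \ldots, \xi_n^{(N,N)}) \in E_n^N = \underbrace{E_n \times \cdots \times E_n}_{N \text{ times}}$$

The initial configuration $\xi_0^{(N)}$ consists of $N$ independent and identically
distributed random variables with common law $\eta_0$. Its elementary transitions
from $E_{n-1}^N$ into $E_n^N$ are given in a symbolic integral form by

$$P^N_{\eta_0}\left( \xi_n^{(N)} \in dx_n \mid \xi_{n-1}^{(N)} \right) = \prod_{p=1}^N K_{n, \frac{1}{N} \sum_{i=1}^N \delta_{\xi_{n-1}^{(N,i)}}}\left( \xi_{n-1}^{(N,p)}, dx_n^p \right) \qquad (3.2)$$

where $dx_n = dx_n^1 \times \ldots \times dx_n^N$ is an infinitesimal neighborhood of a point
$x_n = (x_n^1, \ldots, x_n^N) \in E_n^N$.

As usual, when there is no possible confusion, we simplify notation and
suppress the index $(.)^{(N)}$ so that we write $(\xi_n, \xi_n^i)$ instead of
$(\xi_n^{(N)}, \xi_n^{(N,i)})$. To clarify the presentation we slightly abuse the notation
and sometimes write

$$m(x_n) = \frac{1}{N} \sum_{i=1}^N \delta_{x_n^i}$$

for each $x_n = (x_n^i)_{1 \le i \le N} \in E_n^N$. In this simplified notation, the elementary
transition (3.2) reads

$$P^N_{\eta_0}(\xi_n \in dx_n \mid \xi_{n-1}) = \prod_{p=1}^N K_{n, m(\xi_{n-1})}(\xi_{n-1}^p, dx_n^p)$$

The parameter $N \ge 1$ is called the size of the system or the precision
parameter of the particle algorithm. The $N$-tuple $\xi_n$ represents the
configuration of the system at time $n$ of $N$ particles $\xi_n^i$. The $N$-particle model
associated with the Markov transitions $K_{n,\eta}$ given by (3.1) is the Markov
chain with elementary transitions

$$P^N_{\eta_0}(\xi_{n+1} \in dx_{n+1} \mid \xi_n) = \prod_{p=1}^N S_{n, m(\xi_n)} M_{n+1}(\xi_n^p, dx_{n+1}^p) \qquad (3.3)$$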

P=1 



By direct inspection, we see that

$$P^N_{\eta_0}(\xi_{n+1} \in dx_{n+1} \mid \xi_n) = \int_{E_n^N} \mathcal{S}_n(\xi_n, dx_n)\, \mathcal{M}_{n+1}(x_n, dx_{n+1})$$





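with the Boltzmann-Gibbs transition $\mathcal{S}_n$ from $E_n^N$ into itself and the
mutation transition $\mathcal{M}_{n+1}$ from $E_n^N$ into $E_{n+1}^N$ defined by

$$\mathcal{S}_n(\xi_n, dx_n) = \prod_{p=1}^N S_{n, m(\xi_n)}(\xi_n^p, dx_n^p)$$

$$\mathcal{M}_{n+1}(x_n, dx_{n+1}) = \prod_{p=1}^N M_{n+1}(x_n^p, dx_{n+1}^p)$$

Loosely speaking, this integral decomposition shows that this particle model
has the same updating/prediction nature as that of the "limiting"
Feynman-Kac model. More precisely, the deterministic two-step
updating/prediction transitions in distribution spaces

$$\eta_n \in \mathcal{P}(E_n) \xrightarrow{\ S_{n,\eta_n}\ } \widehat{\eta}_n = \eta_n S_{n,\eta_n} \in \mathcal{P}(E_n) \xrightarrow{\ M_{n+1}\ } \eta_{n+1} = \widehat{\eta}_n M_{n+1} \in \mathcal{P}(E_{n+1})$$

have been replaced by a two-step selection/mutation transition in product
spaces

$$\xi_n \in E_n^N \xrightarrow{\ \text{selection}\ } \widehat{\xi}_n \in E_n^N \xrightarrow{\ \text{mutation}\ } \xi_{n+1} \in E_{n+1}^N$$

Next we describe in more detail the motion of the particles in terms of the
pair transitions $(S_{n,\eta}, M_{n+1})$.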



• Selection: The selection stage only depends on the current potential
function $G_n$ and the parameter $\varepsilon_n$. More precisely, given the
configuration $\xi_n \in E_n^N$ of the system at time $n$, the selection transition
consists in selecting randomly $N$ particles $\widehat{\xi}_n^i$ with respective
distribution

$$S_{n, m(\xi_n)}(\xi_n^i, .) = \varepsilon_n G_n(\xi_n^i)\, \delta_{\xi_n^i} + (1 - \varepsilon_n G_n(\xi_n^i))\, \Psi_n(m(\xi_n))$$

In other words, with a probability $\varepsilon_n G_n(\xi_n^i)$, we set $\widehat{\xi}_n^i = \xi_n^i$; otherwise
we select randomly a particle $\widetilde{\xi}_n^i$ with distribution

$$\Psi_n(m(\xi_n)) = \sum_{i=1}^N \frac{G_n(\xi_n^i)}{\sum_{j=1}^N G_n(\xi_n^j)}\, \delta_{\xi_n^i}$$

and we set $\widehat{\xi}_n^i = \widetilde{\xi}_n^i$.

• Mutation: The mutation stage only depends on the Markov kernel
$M_{n+1}$. During this stage, each selected particle evolves randomly
according to the Markov transition $M_{n+1}$. In other words, given the
selected configuration $\widehat{\xi}_n \in E_n^N$, the mutation transition consists in
sampling randomly $N$ independent random particles $\xi_{n+1}^i$ with
respective distributions $M_{n+1}(\widehat{\xi}_n^i, .)$.
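The selection/mutation steps described above can be sketched as a short simulation. This is a minimal illustration under stated assumptions, not a reference implementation: the three-point state space, the kernel `M`, the potential values `G_VALUES`, and the initial law `ETA0` are hypothetical and time-homogeneous, and the exact flow is computed alongside only for comparison.

```python
import random

# Hypothetical time-homogeneous model on the state space {0, 1, 2}.
M = [[0.5, 0.5, 0.0],
     [0.1, 0.6, 0.3],
     [0.0, 0.4, 0.6]]
G_VALUES = [1.0, 0.5, 2.0]
ETA0 = [0.2, 0.3, 0.5]


def G(x):
    """Potential function G_n (the same at every time in this sketch)."""
    return G_VALUES[x]


def sample_M(x, rng):
    """Draw one mutation from M_{n+1}(x, .)."""
    return rng.choices(range(3), weights=M[x], k=1)[0]


def selection(particles, eps, rng):
    """Keep particle i with probability eps * G(xi^i); otherwise replace it by
    a particle drawn from the weighted empirical measure Psi_n(m(xi_n))."""
    weights = [G(x) for x in particles]
    keep = [rng.random() < eps * w for w in weights]
    fresh = iter(rng.choices(particles, weights=weights, k=keep.count(False)))
    return [x if kept else next(fresh) for x, kept in zip(particles, keep)]


def mutation(particles, rng):
    """Move each selected particle independently according to M_{n+1}."""
    return [sample_M(x, rng) for x in particles]


def particle_estimate(n, N, eps, seed=0):
    """Empirical measure m(xi_n) of the N-particle model after n steps."""
    rng = random.Random(seed)
    particles = rng.choices(range(3), weights=ETA0, k=N)
    for _ in range(n):
        particles = mutation(selection(particles, eps, rng), rng)
    return [particles.count(s) / N for s in range(3)]


def exact_eta(n):
    """Exact prediction flow eta_n = Phi_n(eta_{n-1}), used for comparison."""
    eta = list(ETA0)
    for _ in range(n):
        w = [G(x) * eta[x] for x in range(3)]
        z = sum(w)
        eta = [sum(w[x] / z * M[x][y] for x in range(3)) for y in range(3)]
    return eta
```

Calling `particle_estimate(5, 5000, eps)` with `eps = 0.0` (pure proportional selection) or `eps = 0.5` should give an empirical measure close to `exact_eta(5)`; the choice $\varepsilon_n > 0$ leaves some particles untouched during selection and so typically introduces less resampling noise.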







From the discussion above, we see that the case $\varepsilon_n = 0$ corresponds to
the so-called "simple genetic algorithm" with mutation and proportional
selections. Also notice that in this situation the particle model consists of
$N$ conditionally independent particles. More precisely, in this situation the
elementary transitions (3.3) can be rewritten as

$$P^N_{\eta_0}(\xi_{n+1} \in dx_{n+1} \mid \xi_n) = \prod_{p=1}^N \Phi_{n+1}(m(\xi_n))(dx_{n+1}^p) \qquad (3.4)$$

where $\Phi_{n+1} : \mathcal{P}(E_n) \to \mathcal{P}(E_{n+1})$ is the one-step mapping of the flow $\eta_n$
defined by

$$\Phi_{n+1}(\eta) = \Psi_n(\eta) M_{n+1}$$



3.3 Particle models with Degenerate Potential 

As promised in the introduction, this section discusses the situation where
$G_n$ is not strictly positive. To avoid some unnecessary discussions on
degenerate situations, we suppose the accessibility condition $(\mathcal{A})$ introduced
in (2.16) on page 67 is met so that the limiting Feynman-Kac model is
well-defined at any time.

Two strategies can be underlined. In view of the discussion given in
Section 2.5.2, the first idea is to consider the $N$-particle approximation
model associated with some McKean interpretation of the updated model.
To be more precise, we recall that the updated measures $\widehat{\eta}_n = \Psi_n(\eta_n)$
can be regarded as a sequence of measures on $\widehat{E}_n$ with $\widehat{E}_n = G_n^{-1}(0, \infty)$.
Furthermore, $\widehat{\eta}_n$ coincides with the prediction model starting at $\widehat{\eta}_0$ and
associated with the pair of potentials/kernels $(\widehat{G}_n, \widehat{M}_n)$ on the state spaces
$\widehat{E}_n$ and defined in Proposition 2.5.2, page 70.

One advantage of this interpretation is that the potential $\widehat{G}_n$ is now a
strictly positive function on the restricted space $\widehat{E}_n$. By symmetry
arguments, we prove that the updated model $\widehat{\eta}_n \in \mathcal{P}(\widehat{E}_n)$ satisfies the recursive
equation

$$\widehat{\eta}_{n+1} = \widehat{\eta}_n \widehat{K}_{n+1, \widehat{\eta}_n} \quad \text{with} \quad \widehat{K}_{n+1,\eta} = \widehat{S}_{n,\eta} \widehat{M}_{n+1}$$

The selection transitions $\widehat{S}_{n,\eta}$ are now Markov kernels from $\widehat{E}_n$ into itself
and they are defined for any $x_n \in \widehat{E}_n$ by

$$\widehat{S}_{n,\eta}(x_n, dy_n) = \varepsilon_n \widehat{G}_n(x_n)\, \delta_{x_n}(dy_n) + (1 - \varepsilon_n \widehat{G}_n(x_n))\, \widehat{\Psi}_n(\eta)(dy_n)$$

The Boltzmann-Gibbs transformation $\widehat{\Psi}_n : \mathcal{P}(\widehat{E}_n) \to \mathcal{P}(\widehat{E}_n)$ associated
with $\widehat{G}_n$ is given for any $\eta \in \mathcal{P}(\widehat{E}_n)$ by

$$\widehat{\Psi}_n(\eta)(dx_n) = \frac{1}{\eta(\widehat{G}_n)}\, \widehat{G}_n(x_n)\, \eta(dx_n)$$





In this interpretation, the model $\widehat{\eta}_n$ satisfies the deterministic
updating/prediction transitions

$$\widehat{\eta}_n \in \mathcal{P}(\widehat{E}_n) \xrightarrow{\ \text{updating}\ } \widehat{\eta}_n \widehat{S}_{n,\widehat{\eta}_n} = \widehat{\Psi}_n(\widehat{\eta}_n) \xrightarrow{\ \text{prediction}\ } \widehat{\eta}_{n+1} = \widehat{\Psi}_n(\widehat{\eta}_n) \widehat{M}_{n+1}$$

The $N$-particle model associated with this McKean interpretation is defined
along the same lines as before. It is interesting to observe that in this case
the mutation transitions of the particles depend on the current potential
functions. This aspect of the particle model is particularly useful in filtering
problems (see Section 12.6). In this context, each elementary mutation
depends on the likelihood function of the particle with respect to the current
observation delivered by the sensors. This intuitively indicates that the
resulting algorithm has better "tracking properties" than the one with "free
mutations". We also mention that for regular potential functions the
sampling of transitions according to $\widehat{M}_n$ can be performed by using for instance
an acceptance/rejection simulation technique. More precisely, when $G_n$ has
bounded relative oscillations, we have that

$$\widehat{M}_n(x_{n-1}, dx_n) \le \sup_{y_n, y'_n} \frac{G_n(y_n)}{G_n(y'_n)}\; M_n(x_{n-1}, dx_n)$$

This shows that we can sample elementary $\widehat{M}_n$-transitions with a
traditional acceptance/rejection mechanism based on independent $M_n$-samples.
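A minimal sketch of such an acceptance/rejection scheme is given below. It assumes that the updated transition has the form $\widehat{M}_n(x, dy) \propto M_n(x, dy)\, G_n(y)$ (the form recalled from Proposition 2.5.2, which is not restated in this section) and that an upper bound on $G_n$ is available. The kernel `M`, the potential values `G_VALUES`, and the helper names are hypothetical.

```python
import random

# Hypothetical model on {0, 1, 2}; G_SUP is an upper bound on the potential.
M = [[0.5, 0.5, 0.0],
     [0.1, 0.6, 0.3],
     [0.0, 0.4, 0.6]]
G_VALUES = [1.0, 0.5, 2.0]
G_SUP = max(G_VALUES)


def sample_M(x, rng):
    """Draw one free mutation from M(x, .)."""
    return rng.choices(range(3), weights=M[x], k=1)[0]


def sample_updated(x, rng):
    """Draw from M_hat(x, dy), assumed proportional to M(x, dy) G(y):
    propose y from M(x, .), accept it with probability G(y) / G_SUP."""
    while True:
        y = sample_M(x, rng)
        if rng.random() * G_SUP < G_VALUES[y]:
            return y


def exact_updated(x):
    """Exact law M(x, dy) G(y) / M(G)(x), used only to check the sampler."""
    w = [M[x][y] * G_VALUES[y] for y in range(3)]
    z = sum(w)
    return [v / z for v in w]
```

The expected number of proposals per accepted sample is `G_SUP / M(G)(x)`, so the scheme is practical precisely when the relative oscillations of the potential are moderate.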

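An alternative way to produce these potential-dependent transitions
consists in adding another level of randomness. We proceed as follows. Suppose
we want to simulate the elementary $\widehat{M}_n$-transition of a particle, say $\widehat{\xi}_{n-1}^i$.
Then we first produce a collection of $N'$ auxiliary and independent
transitions $\zeta_n^{i,i'}$, $1 \le i' \le N'$, according to $M_n(\widehat{\xi}_{n-1}^i, .)$. The rationale
behind this is that for sufficiently large $N'$ we have in some sense

$$M_n(\widehat{\xi}_{n-1}^i, dx_n) \simeq M_n^{N'}(\widehat{\xi}_{n-1}^i, dx_n) =_{\text{def.}} \frac{1}{N'} \sum_{i'=1}^{N'} \delta_{\zeta_n^{i,i'}}(dx_n)$$

Replacing $M_n$ by $M_n^{N'}$ in the definitions of $\widehat{M}_n$ and $\widehat{G}_{n-1}$, we obtain a
particle approximation of the desired quantities

$$\widehat{M}_n(\widehat{\xi}_{n-1}^i, dx_n) \simeq \widehat{M}_n^{N'}(\widehat{\xi}_{n-1}^i, dx_n) =_{\text{def.}} \sum_{i'=1}^{N'} \frac{G_n(\zeta_n^{i,i'})}{\sum_{j'=1}^{N'} G_n(\zeta_n^{i,j'})}\, \delta_{\zeta_n^{i,i'}}(dx_n)$$

and

$$\widehat{G}_{n-1}(\widehat{\xi}_{n-1}^i) \simeq \widehat{G}_{n-1}^{N'}(\widehat{\xi}_{n-1}^i) =_{\text{def.}} \frac{1}{N'} \sum_{i'=1}^{N'} G_n(\zeta_n^{i,i'})$$

This strategy provides a "local" $N'$-particle approximation model for the
evaluation of the potential functions and for the sampling of $\widehat{M}_n$ transitions.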
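The local approximation just described can be sketched in a few lines. The example assumes, as in Proposition 2.5.2, that $\widehat{G}_{n-1} = M_n(G_n)$ and that $\widehat{M}_n(x, dy) \propto M_n(x, dy)\, G_n(y)$; the kernel `M`, the potential values `G_VALUES`, and the function name `local_transition` are hypothetical.

```python
import random

# Hypothetical model on {0, 1, 2}.
M = [[0.5, 0.5, 0.0],
     [0.1, 0.6, 0.3],
     [0.0, 0.4, 0.6]]
G_VALUES = [1.0, 0.5, 2.0]


def sample_M(x, rng):
    """Draw one free mutation from M(x, .)."""
    return rng.choices(range(3), weights=M[x], k=1)[0]


def local_transition(x, n_aux, rng):
    """Approximate one M_hat-transition from x with n_aux auxiliary M-samples.

    Returns (y, g_hat): y is drawn from the weighted empirical measure of the
    auxiliary samples, and g_hat is the empirical estimate of M(G)(x).
    """
    aux = [sample_M(x, rng) for _ in range(n_aux)]
    weights = [G_VALUES[z] for z in aux]
    g_hat = sum(weights) / n_aux
    y = rng.choices(aux, weights=weights, k=1)[0]
    return y, g_hat
```

The auxiliary sample size `n_aux` trades accuracy for cost: each particle transition now requires `n_aux` draws from the free kernel.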







The second strategy consists in still working with the McKean
interpretation of the prediction flow associated with the collection of transitions

$$K_{n+1,\eta} = S_{n,\eta} M_{n+1} \quad \text{with} \quad \eta \in \mathcal{P}_n(E_n),\ n \in \mathbb{N} \qquad (3.5)$$

It is important to note that the collection of transitions $K_{n+1,\eta}$, and
particularly $S_{n,\eta}$, are not defined on the whole set of distributions $\mathcal{P}(E_n)$ but on
the subset $\mathcal{P}_n(E_n)$ of measures $\eta$ such that $\eta(G_n) > 0$. In this case, the
particle interpretation given in Definition 3.2.1 is not well-defined since it
may happen that the whole configuration $\xi_n$ moves out of the set $\widehat{E}_n$. To
describe rigorously the particle model associated with this McKean kernel,
we proceed as in Section 2.5.1. We add a cemetery point $\Delta$ to the product
space $E_n^N$ and we extend the test functions and the mutation/selection
transitions $(\mathcal{S}_n, \mathcal{M}_n)$ on $E_n^N$ to $E_n^N \cup \{\Delta\}$ as follows:

• The test functions $\varphi_n \in B_b(E_n^N)$ are extended to $E_n^N \cup \{\Delta\}$ by setting
$\varphi_n(\Delta) = 0$.

• The selection transitions $\mathcal{S}_n$ from $E_n^N$ into itself are extended into
transitions on $E_n^N \cup \{\Delta\}$ by setting $\mathcal{S}_n(x, .) = \delta_\Delta$ as soon as
$m(x) \notin \mathcal{P}_n(E_n)$.

• The mutation transitions $\mathcal{M}_{n+1}$ from $E_n^N$ into $E_{n+1}^N$ are extended into
transitions from $E_n^N \cup \{\Delta\}$ into $E_{n+1}^N \cup \{\Delta\}$ by setting
$\mathcal{M}_{n+1}(\Delta, .) = \delta_\Delta$.

The interacting particle model associated with the McKean interpretation
(3.5) and with an initial distribution $\eta_0 \in \mathcal{P}(E_0)$ is a sequence of
nonhomogeneous Markov chains

$$\left( \Omega^{(N)} = \prod_{n \ge 0} (E_n^N \cup \{\Delta\}),\ F^N = (F_n^N)_{n \in \mathbb{N}},\ \xi = (\xi_n)_{n \in \mathbb{N}},\ P^N_{\eta_0} \right)$$

taking values at each time $n$ in the product space $E_n^N \cup \{\Delta\}$. These models
are again defined by a two-step selection/mutation transition of the same
nature as before.






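$$\xi_n \in E_n^N \cup \{\Delta\} \xrightarrow{\ \text{selection}\ } \widehat{\xi}_n \in E_n^N \cup \{\Delta\} \xrightarrow{\ \text{mutation}\ } \xi_{n+1} \in E_{n+1}^N \cup \{\Delta\}$$

The only difference is that the chain is killed the first time $n$ we have
$m(\xi_n) \notin \mathcal{P}_n(E_n)$. At that time, we set $\widehat{\xi}_n = \Delta$ and $\xi_p = \widehat{\xi}_p = \Delta$ for all
$p > n$. In this general context, it is important to estimate the tails of the
distribution of the date at which the chain is killed

$$\tau^N = \inf\{n \in \mathbb{N} ;\ \widehat{\xi}_n = \Delta\} = \inf\{n \in \mathbb{N} ;\ m(\xi_n)(G_n) = 0\}$$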







In this general context, it may happen that the "limiting" Feynman-Kac
model itself is stopped. Let

$$\tau = \inf\{n \in \mathbb{N} ;\ \eta_n(G_n) = 0\}$$

be the first time the set $G_n^{-1}(0, \infty)$ and the support of the measure $\eta_n$ are
disjoint. In this case, the distribution $\eta_n$ cannot be updated anymore and
the flow $\eta_n$ is only defined up to that time horizon. From the discussion
given on page 71, the time $\tau$ can also be regarded as the first time $n$ we
have $\gamma_{n+1}(1) = 0$. If we set $\mathbf{G}_n(x_0, \ldots, x_n) = \prod_{p=0}^{n} G_p(x_p)$, then we see
that $\tau$ is the first time $n$ we have

$$P_{\eta_0,n}(\mathbf{G}_n) = \int \mathbf{G}_n(x_0, \ldots, x_n)\, \eta_0(dx_0)\, M_1(x_0, dx_1) \cdots M_n(x_{n-1}, dx_n) = 0$$

This shows that at that time the admissible paths of length $n$ must visit at
a given date $p \le n$ the set $G_p^{-1}(0)$. From these observations, it is intuitively
clear that the path-particle genealogical model has the same property and
we have $\tau^N \le \tau$. In Chapter 7, Theorem 7.4.1, we will prove that for any
$n \le \tau$ and $N \ge 1$ we have the exponential estimate

$$P(\tau^N \le n) \le a(n) \exp(-N/b(n)) \qquad (3.6)$$

In particular, this shows that

$$\lim_{N \to \infty} P(\tau^N = \tau) = 1$$

This result indicates that the particle model is asymptotically stopped at
time $\tau$.

In some instances, such as in finite graph analysis, one is interested in
computing the depth $\tau$ of a rooted tree $E$. In this context, we are given
a Markov chain $X_n$ starting at the root and exploring the tree. When $X_n$
visits a leaf, it is stopped. Otherwise, it chooses randomly one of the deeper
vertices in its neighborhood. If we consider the indicator potential functions
$G_n = 1_{E - A}$ of the complement of the leaves subset $A \subset E$, then we clearly have

$$\gamma_{n+1}(1) = P(\forall p \le n,\ X_p \notin A) = E\left( \prod_{p=0}^{n} G_p(X_p) \right)$$

and the depth $\tau$ coincides with the first time $n$ we have $\gamma_{n+1}(1) = 0$. The
particle interpretation of this model is sometimes called the "go with the
winner" algorithm [4, 263, 264]. It consists in evolving $N$ particles from the
root. When a particle reaches a leaf, it is killed and instantly a randomly
chosen particle in $E - A$ duplicates. The exponential estimate above implies
that the particle model is asymptotically stopped at time $\tau$; that is, when
the particles have reached the set of deepest leaves. The running time of the
algorithm, to achieve a given success probability, depends on the tree
geometry and on the way the probability $\gamma_{n+1}(1)$ to reach a depth $n$ tends
to 0 (see for instance Theorem 7.4.1).
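The tree exploration just described can be sketched as follows. This is an illustrative reading of the scheme rather than the algorithm of the cited references: the small tree `TREE` is a hypothetical example, children are chosen uniformly at random, and a killed particle is replaced by a copy of a uniformly chosen surviving one.

```python
import random

# Hypothetical rooted tree given by its children lists; leaves have no children.
# The deepest leaf "e" sits at depth 3.
TREE = {"root": ["a", "b"], "a": [], "b": ["c", "d"], "c": [], "d": ["e"], "e": []}


def is_leaf(v):
    """A vertex is a leaf when it has no children."""
    return not TREE[v]


def go_with_the_winners(root, N, rng):
    """Evolve N particles from the root; return the first time every particle
    sits on a leaf, which estimates the depth of the tree."""
    particles = [root] * N
    depth = 0
    while True:
        alive = [v for v in particles if not is_leaf(v)]
        if not alive:
            return depth
        # Selection: particles on a leaf are killed and replaced by a copy
        # of a uniformly chosen particle that is not on a leaf.
        particles = [v if not is_leaf(v) else rng.choice(alive) for v in particles]
        # Mutation: every particle moves to a uniformly chosen child.
        particles = [rng.choice(TREE[v]) for v in particles]
        depth += 1
```

With a single particle the exploration stops at the first leaf it meets (depth 1 with probability 1/2 on this tree), whereas with many particles the population survives along the deep branch with overwhelming probability.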







3.4 Historical and Genealogical Tree Models 

In this section, we show that the particle interpretation of the Feynman- 
Kac model in path space represents the genealogy of the particle model 
associated with the time marginal model. We first start with a somewhat
heuristic description. Then we provide a rigorous mathematical proof.
nally, we show that this result is related to more general transport problems. 
We end this section around this theme. 

3.4.1 Introduction

The best pedagogical way to introduce the historical model associated
with a genetic type particle approximation model is to analyze the
situation where the underlying Markov motion is a path process. Suppose
$X'_n$ is an auxiliary nonhomogeneous Markov chain with initial distribution
$\eta_0 \in \mathcal{P}(E'_0)$ and elementary transitions $M'_{n+1}$ from some measurable space
$(E'_n, \mathcal{E}'_n)$ into another measurable space $(E'_{n+1}, \mathcal{E}'_{n+1})$. Also let $X_n$ be the
stochastic sequence defined by

$$X_n = X'_{[0,n]}\ (= (X'_0, \ldots, X'_n)) \in E_n = E'_{[0,n]}\ (= (E'_0 \times \cdots \times E'_n))$$

We recall that $X_n$ forms a nonhomogeneous Markov chain with initial
distribution $\eta_0 \in \mathcal{P}(E_0)$ and its transitions $M_n$ from $E_{n-1}$ into $E_n$ are defined
in (2.3) on page 51.

We consider a sequence of bounded potential functions $G_n : E_n \to (0, \infty)$
and we suppose the energy of a path $x_n = (x'_0, \ldots, x'_n) \in E_n$ only depends
on its terminal value. That is, we have $G_n(x_n) = G'_n(x'_n)$ for some function
$G'_n$ on $E'_n$. The Feynman-Kac model $\eta_n$ associated with the pair $(G_n, M_n)$
can be alternatively written for any $f_n \in \mathcal{B}_b(E_n)$ as $\eta_n(f_n) = \gamma_n(f_n)/\gamma_n(1)$
with the Feynman-Kac formulae

$$\gamma_n(f_n) = \mathbb{E}_{\eta_0}\Big(f_n(X'_0, \ldots, X'_n) \prod_{p=0}^{n-1} G'_p(X'_p)\Big) \qquad (3.7)$$



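Let $\pi_n$ be the canonical projection from $E_n$ into $E'_n$ defined by

$$\pi_n : x_n = (x'_0, \ldots, x'_n) \in E_n \longmapsto \pi_n(x_n) = x'_n \in E'_n$$

We associate with $\pi_n$ the image mapping $\eta \mapsto \eta \circ \pi_n^{-1}$ from $\mathcal{P}(E_n)$ into
$\mathcal{P}(E'_n)$, defined for any $\eta \in \mathcal{P}(E_n)$ and $A'_n \in \mathcal{E}'_n$ by setting

$$(\eta \circ \pi_n^{-1})(A'_n) = \eta(\pi_n^{-1}(A'_n))$$

The image measures $\eta'_n = \eta_n \circ \pi_n^{-1}$ are clearly the $n$th time
marginals of $\eta_n$. In view of (3.7), the sequence of measures $\eta'_n$ is the
Feynman-Kac distribution flow associated with the pair $(G'_n, M'_n)$ and is
defined for any $f'_n \in \mathcal{B}_b(E'_n)$ by the formulae

$$\eta'_n(f'_n) = \gamma'_n(f'_n)/\gamma'_n(1) \quad\text{with}\quad \gamma'_n(f'_n) = \mathbb{E}_{\eta_0}\Big(f'_n(X'_n) \prod_{p=0}^{n-1} G'_p(X'_p)\Big)$$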



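The $N$-path particle model associated with the McKean interpretation (3.1)
of the Feynman-Kac flow $\eta_n$ in path space consists of $N$ path particles. That
is, for any $1 \le i \le N$ and $n \in \mathbb{N}$, we have

$$\xi^i_n = (\xi^i_{p,n})_{0\le p\le n} \quad\text{and}\quad \hat\xi^i_n = (\hat\xi^i_{p,n})_{0\le p\le n} \in E_n = E'_{[0,n]}$$

We recall that the motion of the $N$-path particles is decomposed into two
separate selection/mutation transitions:

$$\xi_n \in E_n^N \xrightarrow{\ S_{n,m(\xi_n)}\ } \hat\xi_n \in E_n^N \xrightarrow{\ M_{n+1}\ } \xi_{n+1} \in E_{n+1}^N$$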



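During the selection transition, each path particle $\xi^i_n$ selects a new path
particle $\hat\xi^i_n$ randomly chosen according to the distribution

$$S_{n,m(\xi_n)}(\xi^i_n, \cdot) = \varepsilon_n G_n(\xi^i_n)\, \delta_{\xi^i_n} + (1 - \varepsilon_n G_n(\xi^i_n))\, \Psi_n(m(\xi_n))$$

with the Boltzmann-Gibbs distribution

$$\Psi_n(m(\xi_n)) = \sum_{i=1}^N \frac{G_n(\xi^i_n)}{\sum_{j=1}^N G_n(\xi^j_n)}\, \delta_{\xi^i_n}$$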



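At this stage, it is convenient to make a couple of remarks. First, we note
that the selection probabilities only depend on the terminal values
$\pi_n(\xi^i_n) = \xi^i_{n,n}$ of the paths. From this observation, we check that the
terminal values $\hat\xi^i_{n,n}$ of the selected paths are randomly chosen according to
the distribution $S'_{n,m(\xi'_n)}(\xi^i_{n,n}, \cdot)$ with
$m(\xi'_n) = \frac{1}{N}\sum_{i=1}^N \delta_{\xi^i_{n,n}} = m(\xi_n) \circ \pi_n^{-1}$ and

$$S'_{n,m(\xi'_n)}(\xi^i_{n,n}, \cdot) = \varepsilon_n G'_n(\xi^i_{n,n})\, \delta_{\xi^i_{n,n}} + (1 - \varepsilon_n G'_n(\xi^i_{n,n})) \sum_{j=1}^N \frac{G'_n(\xi^j_{n,n})}{\sum_{k=1}^N G'_n(\xi^k_{n,n})}\, \delta_{\xi^j_{n,n}}$$

Secondly, if the $i$th particle $\hat\xi^i_n$ has selected, say, the $j$th terminal state
$\xi^j_{n,n}$, we can alternatively define the $i$th path selection by setting
$\hat\xi^i_n = \xi^j_n$. In other words, we have in a symbolic form

$$\hat\xi^i_{n,n} = \xi^j_{n,n} \implies \hat\xi^i_n = \xi^j_n = (\xi^j_{p,n})_{0\le p\le n}$$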







If we interpret the path particles as an ancestral line, we see that the path
selection tends to select randomly the ancestral line of a current individual
with high $G'_n$-potential.

During mutation, each selected path $\hat\xi^i_n$ evolves randomly according to
the transition $M_{n+1}$ of the path process. By definition, this transition
simply consists in extending the selected path with an elementary move
randomly performed with $M'_{n+1}$. In other words, we have

$$\xi^i_{n+1} = \big((\xi^i_{p,n+1})_{0\le p\le n},\ \xi^i_{n+1,n+1}\big) = \big((\hat\xi^i_{p,n})_{0\le p\le n},\ \xi^i_{n+1,n+1}\big) \in E_{n+1} = E_n \times E'_{n+1}$$

where $\xi^i_{n+1,n+1}$ is a random variable with law $M'_{n+1}(\hat\xi^i_{n,n}, \cdot)$. In connection
with the interpretation above, this elementary move can be regarded as the
mutation of the selected individual $\hat\xi^i_{n,n}$ with ancestral line
$(\hat\xi^i_{0,n}, \ldots, \hat\xi^i_{n,n})$.

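From the preceding discussion it should be intuitively clear that the
marginal particle model is the two-step selection/mutation Markov model

$$\xi_{n,n} \in (E'_n)^N \xrightarrow{\ \text{selection}\ } \hat\xi_{n,n} \in (E'_n)^N \xrightarrow{\ \text{mutation}\ } \xi_{n+1,n+1} \in (E'_{n+1})^N$$

Furthermore, it coincides with the particle model associated with the
McKean interpretation of the flow $\eta'_n$ with Markov kernels
$K'_{n+1,\eta'} = S'_{n,\eta'} M'_{n+1}$.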
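To make the heuristic description concrete, here is a minimal Python sketch
of one selection/mutation transition of the path-particle model. It is an
illustration under stated assumptions, not the construction of the text: the
state space is the integers, the mutation `random_walk_move` is a simple
random walk, the potential `confining_potential` takes values in $(0, 1]$,
and all function names are ours. Paths are stored as tuples, so that the
last entry of each tuple is the terminal value and the whole tuple is the
ancestral line.

```python
import math
import random


def path_particle_step(paths, potential, mutate, eps, rng):
    """One selection/mutation transition of the N-path-particle model.

    paths is a list of tuples (x_0, ..., x_n); potential plays the role of
    G'_n and is evaluated at the terminal value only; eps * potential must
    stay in [0, 1].
    """
    weights = [potential(path[-1]) for path in paths]
    selected = []
    for path, weight in zip(paths, weights):
        if rng.random() < eps * weight:
            # Accepted: the particle keeps its own ancestral line.
            selected.append(path)
        else:
            # Rejected: it adopts the ancestral line of a particle chosen
            # with probability proportional to the terminal potential.
            selected.append(rng.choices(paths, weights=weights, k=1)[0])
    # Mutation: extend each selected path by one elementary move.
    return [path + (mutate(path[-1], rng),) for path in selected]


def random_walk_move(x, rng):
    """Illustrative elementary transition: simple random walk on the integers."""
    return x + rng.choice((-1, 1))


def confining_potential(x):
    """Illustrative potential with values in (0, 1]."""
    return math.exp(-abs(x))
```

Dropping all but the last coordinate of each tuple gives exactly the
two-step selection/mutation model on the terminal values, since neither the
selection weights nor the mutation look at the earlier coordinates.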



3.4.2 A Rigorous Approach and Related Transport Problems

Our next objective is to make rigorous the intuitive statement presented
in the introductory Section 3.4.1. To this end, it is convenient to introduce
the canonical mappings

$$\pi^N_n : (x^1_n, \ldots, x^N_n) \in E_n^N \longmapsto \pi^N_n(x^1_n, \ldots, x^N_n) = (\pi_n(x^1_n), \ldots, \pi_n(x^N_n)) \in (E'_n)^N$$

Proposition 3.4.1 Let $(\xi_n, \hat\xi_n)$ be the $N$-path-particle model associated
with the Feynman-Kac distribution flow $\eta_n$ on path space. The stochastic
process defined by

$$\xi'_n = \pi^N_n(\xi_n) \quad\text{and}\quad \hat\xi'_n = \pi^N_n(\hat\xi_n)$$

coincides with the $N$-particle model associated with the Feynman-Kac flow
$\eta'_n$.

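Proof:

We first prove that for any $u_n = (u'_0, \ldots, u'_n) \in E_n$, $n \in \mathbb{N}$, and $\eta \in \mathcal{P}(E_n)$
we have

$$S_{n,\eta}(u_n, \cdot) \circ \pi_n^{-1} = S'_{n,\eta\circ\pi_n^{-1}}(\pi_n(u_n), \cdot) \qquad (3.8)$$

$$M_{n+1}(u_n, \cdot) \circ \pi_{n+1}^{-1} = M'_{n+1}(\pi_n(u_n), \cdot) \qquad (3.9)$$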



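Let $f'_n \in \mathcal{B}_b(E'_n)$ and let us denote by $dv_n = dv'_0 \times \cdots \times dv'_n$ the infinitesimal
neighborhood of a point $v_n = (v'_0, \ldots, v'_n) \in E_n$. A simple calculation shows
that

$$\int_{E_n} f'_n(\pi_n(v_n))\, S_{n,\eta}(u_n, dv_n) = \int_{E_n} f'_n(v'_n) \Big[\varepsilon_n G'_n(u'_n)\, \delta_{u_n}(dv_n) + (1 - \varepsilon_n G'_n(u'_n))\, \frac{G'_n(v'_n)}{\eta\circ\pi_n^{-1}(G'_n)}\, \eta(dv_n)\Big]$$

$$= \int_{E'_n} f'_n(v'_n)\, \varepsilon_n G'_n(u'_n)\, \delta_{u'_n}(dv'_n) + \int_{E'_n} f'_n(v'_n)\, (1 - \varepsilon_n G'_n(u'_n))\, \frac{G'_n(v'_n)}{\eta\circ\pi_n^{-1}(G'_n)}\, \eta\circ\pi_n^{-1}(dv'_n)$$

This implies that

$$\int_{E_n} f'_n(\pi_n(v_n))\, S_{n,\eta}(u_n, dv_n) = \int_{E'_n} f'_n(v'_n)\, S'_{n,\eta\circ\pi_n^{-1}}(\pi_n(u_n), dv'_n)$$

and the proof of the first assertion is completed. To prove the second one,
we argue in the same way. We first note that for any $u_{n-1} \in E_{n-1}$, $n \ge 1$,
we have

$$\int_{E_n} f'_n(\pi_n(v_n))\, M_n(u_{n-1}, dv_n) = \int_{E'_n} f'_n(v'_n)\, M'_n(\pi_{n-1}(u_{n-1}), dv'_n)$$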

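This ends the proof of these two preliminary results. We now come to the
proof of the proposition. For any $f'_n \in \mathcal{B}_b((E'_n)^N)$ and $n \ge 0$, writing
$x'_n = (x'^{1}_n, \ldots, x'^{N}_n)$, we find that

$$\mathbb{E}^N_{\eta_0}\big(f'_n(\pi^N_n(\hat\xi_n)) \mid \xi_n\big) = \int_{E_n^N} f'_n(\pi^N_n(x_n)) \prod_{i=1}^N S_{n,m(\xi_n)}(\xi^i_n, dx^i_n)$$

$$= \int_{(E'_n)^N} f'_n(x'_n) \prod_{i=1}^N \big(S_{n,m(\xi_n)}(\xi^i_n, \cdot) \circ \pi_n^{-1}\big)(dx'^{i}_n)$$

Using (3.8), we obtain

$$\mathbb{E}^N_{\eta_0}\big(f'_n(\pi^N_n(\hat\xi_n)) \mid \xi_n\big) = \int_{(E'_n)^N} f'_n(x'_n) \prod_{i=1}^N S'_{n,m(\xi'_n)}(\xi^i_{n,n}, dx'^{i}_n)$$

Finally, we observe that

$$\mathbb{E}^N_{\eta_0}\big(f'_{n+1}(\pi^N_{n+1}(\xi_{n+1})) \mid \hat\xi_n\big) = \int_{E_{n+1}^N} f'_{n+1}(\pi^N_{n+1}(x_{n+1})) \prod_{i=1}^N M_{n+1}(\hat\xi^i_n, dx^i_{n+1})$$

$$= \int_{(E'_{n+1})^N} f'_{n+1}(x'_{n+1}) \prod_{i=1}^N \big(M_{n+1}(\hat\xi^i_n, \cdot) \circ \pi_{n+1}^{-1}\big)(dx'^{i}_{n+1})$$

Using (3.9), we conclude that

$$\mathbb{E}^N_{\eta_0}\big(f'_{n+1}(\pi^N_{n+1}(\xi_{n+1})) \mid \hat\xi_n\big) = \int_{(E'_{n+1})^N} f'_{n+1}(x'_{n+1}) \prod_{i=1}^N M'_{n+1}(\hat\xi^i_{n,n}, dx'^{i}_{n+1}) = \mathbb{E}^N_{\eta_0}\big(f'_{n+1}(\xi'_{n+1}) \mid \hat\xi'_n\big)$$

From the Markov description of the particle models, the end of the proof
of the proposition is now straightforward. ■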

On closer inspection, the proof of the proposition above shows that this
result is related to a more general transport problem. Suppose $\eta_n$ is a given
sequence of distributions on some measurable space $(E_n, \mathcal{E}_n)$ with a McKean
interpretation

$$\eta_{n+1} = \eta_n K_{n+1,\eta_n}$$

where $K_{n+1,\eta}$ is a well-defined collection of Markov transitions from $E_n$ into
$E_{n+1}$. We consider an auxiliary collection of measurable spaces $(E'_n, \mathcal{E}'_n)$ and
a measurable mapping $\pi_n : E_n \to E'_n$. As before, we introduce the image
measures $\eta'_n$ of $\eta_n$ with respect to $\pi_n$, defined by

$$\eta'_n = \eta_n \circ \pi_n^{-1} \in \mathcal{P}(E'_n)$$

We assume there exists a McKean interpretation

$$\eta'_{n+1} = \eta'_n K'_{n+1,\eta'_n}$$

in terms of a collection of Markov kernels $K'_{n+1,\eta'}$, $\eta' \in \mathcal{P}(E'_n)$, from $E'_n$
into $E'_{n+1}$. Note that the existence of a McKean measure is equivalent to
the fact that the sequence $\eta'_n$ satisfies a recursive equation. Using the same
arguments as the ones used in the proof of Proposition 3.4.1, we prove the
following result.

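Proposition 3.4.2 Suppose the collections of Markov kernels $K_{n+1,\eta}$ and
$K'_{n+1,\eta'}$ satisfy the compatibility condition

$$K_{n+1,\eta}(x_n, \cdot) \circ \pi_{n+1}^{-1} = K'_{n+1,\eta\circ\pi_n^{-1}}(\pi_n(x_n), \cdot) \qquad (3.10)$$

for any $(x_n, \eta) \in (E_n \times \mathcal{P}(E_n))$ and $n \in \mathbb{N}$. Let $\xi_n \in E_n^N$ be the
$N$-particle model associated with the collection of transitions $K_{n+1,\eta}$ and with
a measure $\eta_0 \in \mathcal{P}(E_0)$. Then, the stochastic process defined by

$$\xi'_n = \pi^N_n(\xi_n) = (\pi_n(\xi^1_n), \ldots, \pi_n(\xi^N_n)) \in (E'_n)^N$$

coincides with the $N$-particle model associated with the collection of
transitions $K'_{n+1,\eta'}$ and the measure $\eta'_0 = \eta_0 \circ \pi_0^{-1} \in \mathcal{P}(E'_0)$.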







We can use this transport property to construct genealogical tree models
of various classes of interacting jump particle models. Although we
have not worked out the continuous time version of this proposition, we
believe that this strategy can be used to analyze the genealogical structure of
Nanbu type and colliding particle interpretations of Boltzmann's rarefied
gas models.

3.4.3 Complete Genealogical Tree Models

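In this section we give a brief discussion on complete genealogical tree
models. We suppose $E_n$ are defined by

$$E_n = E'_{[0,n]}\ (= (E'_0 \times \cdots \times E'_n))$$

We let $\pi_n$ be the canonical projection from $E_n$ into $E'_n$ and we suppose
that $K_{n+1,\eta}$ is a given collection of Markov transitions from $E_n$ into $E_{n+1}$.
When condition (3.10) holds for some Markov transitions $K'_{n+1,\eta'}$ from $E'_n$
into $E'_{n+1}$, we have two distribution flows:

$$\eta_{n+1} = \eta_n K_{n+1,\eta_n} \in \mathcal{P}(E_{n+1})$$

$$\eta'_{n+1} = \eta'_n K'_{n+1,\eta'_n} = \eta_{n+1} \circ \pi_{n+1}^{-1} \in \mathcal{P}(E'_{n+1})$$

Let $K'_{\eta'_0,n} \in \mathcal{P}(E'_{[0,n]})$ be the McKean measure associated with $K'_{n,\eta'}$ and
$\eta'_0 = \eta_0 \in \mathcal{P}(E_0)$ and given by

$$K'_{\eta'_0,n}(d(x'_0, \ldots, x'_n)) = \eta'_0(dx'_0)\, K'_{1,\eta'_0}(x'_0, dx'_1) \cdots K'_{n,\eta'_{n-1}}(x'_{n-1}, dx'_n)$$

We observe that the sequence of distributions $K'_{\eta'_0,n}$ satisfies the recursive
equation

$$K'_{\eta'_0,n+1} = K'_{\eta'_0,n} \times K'_{n+1,\eta'_n} \qquad (3.11)$$

If we set for each $\mu_n \in \mathcal{P}(E'_{[0,n]})$

$$K_{n+1,\mu_n}((x'_0, \ldots, x'_n), d(y'_0, \ldots, y'_{n+1})) = \delta_{(x'_0,\ldots,x'_n)}(d(y'_0, \ldots, y'_n))\, K'_{n+1,\mu_n\circ\pi_n^{-1}}(y'_n, dy'_{n+1})$$

then (3.11) can be rewritten in the following form

$$K'_{\eta'_0,n+1} = K'_{\eta'_0,n}\, K_{n+1,K'_{\eta'_0,n}} \in \mathcal{P}(E_{n+1})$$

By definition, the collection of transitions $K_{n+1,\mu_n}$ from $E_n$ into $E_{n+1}$
satisfies condition (3.10); that is, we have for any $\mu_n \in \mathcal{P}(E_n)$ and $x_n \in E_n$

$$K_{n+1,\mu_n}(x_n, \cdot) \circ \pi_{n+1}^{-1} = K'_{n+1,\mu_n\circ\pi_n^{-1}}(\pi_n(x_n), \cdot)$$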








FIGURE 3.1. Complete genealogical tree 



To illustrate these abstract constructions, we notice that, in the case of
the Feynman-Kac models examined above, the distributions $\eta_n$ and $K'_{\eta'_0,n}$
represent respectively the Feynman-Kac path measure associated with the
pair $(G'_n, M'_n)$ and the McKean measure associated with $K'_{n,\eta'}$. From
previous considerations, the $N$-interacting-particle model
$\xi_n = (\xi^i_{0,n}, \ldots, \xi^i_{n,n})_{1\le i\le N}$
associated with $K_{n,\eta}$ gives the genealogies of the $N$-particle model $\xi'_n$
associated with $K'_{n,\eta'}$. Furthermore, the $N$-path-particle model
$(\xi'_0, \ldots, \xi'_n)$
coincides with the $N$-path particle model associated with $K_{n,\mu}$. In contrast
with the genealogical tree model $\xi_n$, the system $(\xi'_0, \ldots, \xi'_n)$ represents the
complete history of the particle model $\xi'_n$.

In Figure 3.1, we have represented with a thick line the genealogical tree
of the current set of $N = 7$ individuals and by thin lines the complete
genealogy of the particle evolution model since its origin.



3.5 Particle Approximation Measures 

In Section 3.2 and Section 3.4, we introduced several particle interpretations
of Feynman-Kac distribution flows, including path-space and genealogical
tree-based models. Chapter 12 is devoted to selected applications in
statistics, physics, biology, operations research, and signal processing. In
each of these domains particle models have a different interpretation. They
can be regarded alternatively as a particle simulation technique, as a
microscopic particle interpretation of some physical or biological equations,
or as a stochastic and adaptive grid approximation technique. The asymptotic
behavior of particle models as the size of the systems and the time parameter
tend to infinity will be discussed in Chapter 7.

The objective of this section is mainly to motivate the abstract
constructions developed in earlier sections as well as to illustrate the
forthcoming applications and convergence analysis. We first present a
nonexhaustive catalog of particle approximation measures. In Section 3.5.1,
we provide some convergence estimates. In Section 3.5.2, we discuss the main
regularity properties used in this book.

At the risk of repetition, we briefly recall the definition of the main
Feynman-Kac models introduced so far. Let $(E'_n, \mathcal{E}'_n)$, $n \in \mathbb{N}$, be a sequence
of measurable spaces. We denote by $X_n = X'_{[0,n]}$, $n \in \mathbb{N}$, the historical
process associated with a nonhomogeneous Markov chain $X'_n$ with
transitions $M'_{n+1}$ from $E'_n$ into $E'_{n+1}$. By $M_n$ we denote the Markov transitions
of the path process $X_n$. We also consider a collection of bounded and
nonnegative potential functions $G'_n$ on $E'_n$, and we let $G_n$ be the extension of
$G'_n$ to the path space $E_n = E'_{[0,n]}$ defined by $G_n(x'_0, \ldots, x'_n) = G'_n(x'_n)$.
To clarify the presentation and unless otherwise stated, we assume that
the potential functions are strictly positive. The Feynman-Kac prediction
flow $\eta'_n \in \mathcal{P}(E'_n)$ associated with the pair $(G'_n, M'_n)$ is defined for any test
function $f'_n \in \mathcal{B}_b(E'_n)$ by the formula

$$\eta'_n(f'_n) = \gamma'_n(f'_n)/\gamma'_n(1) \quad\text{with}\quad \gamma'_n(f'_n) = \mathbb{E}_{\eta_0}\Big(f'_n(X'_n) \prod_{p=0}^{n-1} G'_p(X'_p)\Big)$$



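The measures $\eta'_n$ are the marginal distributions of the Feynman-Kac
prediction model $\eta_n \in \mathcal{P}(E_n)$ associated with the pair $(G_n, M_n)$ and defined
for any test function $f_n \in \mathcal{B}_b(E_n)$ by

$$\eta_n(f_n) = \gamma_n(f_n)/\gamma_n(1) \quad\text{with}\quad \gamma_n(f_n) = \mathbb{E}_{\eta_0}\Big(f_n(X_n) \prod_{p=0}^{n-1} G_p(X_p)\Big)$$

The corresponding updated models are defined by

$$\hat\eta'_n(f'_n) = \eta'_n(f'_n G'_n)/\eta'_n(G'_n) = \hat\gamma'_n(f'_n)/\hat\gamma'_n(1) \quad\text{with}\quad \hat\gamma'_n(f'_n) = \gamma'_n(f'_n G'_n)$$

and

$$\hat\eta_n(f_n) = \eta_n(f_n G_n)/\eta_n(G_n) = \hat\gamma_n(f_n)/\hat\gamma_n(1) \quad\text{with}\quad \hat\gamma_n(f_n) = \gamma_n(f_n G_n)$$

We finally recall (see Proposition 2.3.1) that the unnormalized models
$(\gamma_n, \hat\gamma_n)$ can be written in terms of the normalized distribution flows
$(\eta_n, \hat\eta_n)$ with the multiplicative formulae

$$\gamma_n(f_n) = \eta_n(f_n) \prod_{p=0}^{n-1} \eta_p(G_p) \quad\text{and}\quad \hat\gamma_n(f_n) = \hat\eta_n(f_n) \prod_{p=0}^{n} \eta_p(G_p)$$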
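The multiplicative formula for the prediction model is easy to check
numerically on a finite state space. The sketch below is only an
illustration: it assumes a time homogeneous potential `G` and transition
matrix `M` (both hypothetical numerical values), and the function name
`feynman_kac_flow` is ours. It propagates the unnormalized measure
$\gamma_p$ and the normalized one $\eta_p$ side by side and records the
factors $\eta_p(G)$.

```python
def feynman_kac_flow(eta0, potential, kernel, n):
    """Propagate gamma_p and eta_p for p = 0, ..., n on a finite state space.

    Returns (gamma_n, eta_n, [eta_0(G), ..., eta_{n-1}(G)]).
    """
    states = range(len(eta0))
    gamma, eta = list(eta0), list(eta0)
    normalizing_terms = []
    for _ in range(n):
        normalizing_terms.append(sum(eta[x] * potential[x] for x in states))
        # gamma_{p+1}(y) = sum_x gamma_p(x) G(x) M(x, y)
        gamma = [sum(gamma[x] * potential[x] * kernel[x][y] for x in states)
                 for y in states]
        mass = sum(gamma)
        eta = [g / mass for g in gamma]
    return gamma, eta, normalizing_terms


eta0 = [0.5, 0.5]
G = [1.0, 0.5]
M = [[0.9, 0.1], [0.2, 0.8]]
gamma, eta, terms = feynman_kac_flow(eta0, G, M, 4)

# Product of the normalizing terms eta_p(G), p = 0, ..., n - 1.
product = 1.0
for term in terms:
    product *= term
```

Up to rounding errors, `gamma[y]` agrees with `eta[y] * product` for every
state `y`, which is the first multiplicative formula applied to indicator
functions.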




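Suppose we are given a McKean interpretation of the flows

$$\eta_{n+1} = \eta_n K_{n+1,\eta_n} \quad\text{and}\quad \eta'_{n+1} = \eta'_n K'_{n+1,\eta'_n}$$

with a collection of "compatible" Markov kernels $K'_{n+1,\eta'} = S'_{n,\eta'} M'_{n+1}$ and
$K_{n+1,\eta} = S_{n,\eta} M_{n+1}$ (see Section 2.5.3 and Section 3.4). The corresponding
McKean measures (see Section 2.5.2) are defined by

$$K_{\eta_0,n}(d(x_0, \ldots, x_n)) = \eta_0(dx_0)\, K_{1,\eta_0}(x_0, dx_1) \cdots K_{n,\eta_{n-1}}(x_{n-1}, dx_n)$$

$$K'_{\eta'_0,n}(d(x'_0, \ldots, x'_n)) = \eta'_0(dx'_0)\, K'_{1,\eta'_0}(x'_0, dx'_1) \cdots K'_{n,\eta'_{n-1}}(x'_{n-1}, dx'_n)$$

The $N$-particle model associated with $K_{n,\eta}$ consists of $N$ path particles

$$\xi^i_n = (\xi^i_{p,n})_{0\le p\le n} \quad\text{and}\quad \hat\xi^i_n = (\hat\xi^i_{p,n})_{0\le p\le n} \in E_n = E'_{[0,n]}$$

Furthermore, the "marginal systems" $(\xi^i_{n,n})_{1\le i\le N}$ and $(\hat\xi^i_{n,n})_{1\le i\le N}$ on $E'_n$
coincide with the $N$-particle model associated with $K'_{n,\eta'}$.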
We are now in position to present a list of particle approximation models
of the previously defined distributions. We use the superscript $(\cdot)^N$ to define
the particle approximation measure $\nu^N$ of a measure $\nu$ that is in some sense
$\lim_{N\to\infty} \nu^N = \nu$. The particle approximation measures of the McKean
distributions are given by




i=l 
1 ^ 
t=l 
1 ^ 



t=l 



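The particle approximation measures of the prediction models $(\gamma_n, \eta_n)$ are
defined by

$$\gamma^N_n(\cdot) = \Big[\prod_{p=0}^{n-1} \eta^N_p(G_p)\Big] \times \eta^N_n(\cdot) \quad\text{with}\quad \eta^N_n = \frac{1}{N}\sum_{i=1}^N \delta_{\xi^i_n}$$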

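The resulting particle approximation measures of the "marginal" prediction
models $(\gamma'_n, \eta'_n)$ are clearly given by

$$\gamma'^{N}_n(\cdot) = \Big[\prod_{p=0}^{n-1} \eta'^{N}_p(G'_p)\Big] \times \eta'^{N}_n(\cdot) \quad\text{with}\quad \eta'^{N}_n = \frac{1}{N}\sum_{i=1}^N \delta_{\xi^i_{n,n}}$$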




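The $N$-particle approximation of the updated models $(\hat\gamma_n, \hat\eta_n)$ can be
defined in two different ways. First we can choose the updated path-particle
model

$$\hat\gamma^N_n(\cdot) = \Big[\prod_{p=0}^{n} \eta^N_p(G_p)\Big] \times \hat\eta^N_n(\cdot) \quad\text{with}\quad \hat\eta^N_n = \frac{1}{N}\sum_{i=1}^N \delta_{\hat\xi^i_n}$$

Alternatively, we can choose the updated versions of the distributions $\eta^N_n$
and $\gamma^N_n$ defined respectively in terms of the Boltzmann-Gibbs
transformations

$$\tilde\gamma^N_n(\cdot) = \Big[\prod_{p=0}^{n} \eta^N_p(G_p)\Big] \times \tilde\eta^N_n(\cdot) \quad\text{with}\quad \tilde\eta^N_n(dx_n) = \frac{G_n(x_n)}{\eta^N_n(G_n)}\, \eta^N_n(dx_n)$$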



When the potential functions are not strictly positive, the particle model
is generally stopped the first time $\tau^N$ the whole configuration visits the set
$G_n^{-1}(0)$ (see the end of Section 3.2). In this situation, we use for instance
the particle approximation measures

$$\gamma'^{N}_n(\cdot) = \Big[\prod_{p=0}^{n-1} \eta'^{N}_p(G'_p)\Big] \times \eta'^{N}_n(\cdot) \quad\text{with}\quad \eta'^{N}_n = 1_{\tau^N \ge n}\ \frac{1}{N}\sum_{i=1}^N \delta_{\xi^i_{n,n}}$$



3.5.1 Some Convergence Results 

To guide the reader, we provide hereafter a brief and informal discussion of
the asymptotic behavior of these particle measures as the size of the system
$N$ tends to infinity. This short discussion will help the reader appreciate
the impact and the usefulness of the preceding particle interpretations
in the set of applications presented in Chapter 12. We will develop the
complete proofs of these results and the precise set of conditions on the
pair $(G_n, M_n)$ in Chapters 7 to 10.

A surprising result at first sight is that $\gamma^N_n$ is an unbiased estimator; that
is, we have for any $f_n \in \mathcal{B}_b(E_n)$

$$\mathbb{E}^N_{\eta_0}(\gamma^N_n(f_n)) = \gamma_n(f_n)$$

In fact, we will see that $\gamma^N_n$ can be regarded as the terminal value of a
martingale. This observation is also the stepping stone of a powerful semigroup
approach for studying the fluctuations of these particle models.
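As an illustration of how the unnormalized estimate is assembled in
practice, the following Python sketch computes the total mass
$\prod_{p=0}^{n-1} \eta^N_p(G_p)$ along one run of a particle model. The
choices made here are assumptions for the sake of the example: the selection
is a plain multinomial resampling (the case $\varepsilon_n = 0$), the
potential and the Gaussian random walk mutation are hypothetical, and all
function names are ours.

```python
import math
import random


def particle_normalizing_constant(init, potential, mutate, n, n_particles, rng):
    """Return the particle estimate of gamma_n(1) and the particles at time n."""
    particles = [init(rng) for _ in range(n_particles)]
    estimate = 1.0
    for _ in range(n):
        weights = [potential(x) for x in particles]
        # Multiply by eta_p^N(G_p), the empirical mean of the potential.
        estimate *= sum(weights) / n_particles
        # Selection: multinomial resampling proportional to the potential.
        selected = rng.choices(particles, weights=weights, k=n_particles)
        # Mutation: independent moves of the selected particles.
        particles = [mutate(x, rng) for x in selected]
    return estimate, particles


def gaussian_start(rng):
    return rng.gauss(0.0, 1.0)


def gaussian_move(x, rng):
    return x + rng.gauss(0.0, 1.0)


def bounded_potential(x):
    """Illustrative potential with values in (0.2, 1]."""
    return 0.2 + 0.8 * math.exp(-x * x)
```

Averaging the returned estimate over many independent runs is one way to
observe the unbiasedness property empirically; for a constant potential the
estimate is exact for every run.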

In a first stage of analysis, we will develop a collection of $L_p$-estimates
using martingale decompositions and limit theorems for processes. For
instance, we will show that

$$\sqrt{N}\ \mathbb{E}^N_{\eta_0}\big(|\gamma^N_n(f_n) - \gamma_n(f_n)|^p\big)^{1/p} \le a(p)\, b(n)\, \|f_n\|$$



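We will also extend these estimates to the empirical processes
$\eta^N_n : f_n \in \mathcal{F}_n \to \eta^N_n(f_n)$ associated with a given countable collection of
uniformly bounded functions $\mathcal{F}_n \subset \mathcal{B}_b(E_n)$. These results at the process level
lead to a version of the Glivenko-Cantelli theorem for a particle model. In
particular, we will prove that

$$\sqrt{N}\ \mathbb{E}^N_{\eta_0}\Big(\sup_{f_n\in\mathcal{F}_n} |\eta^N_n(f_n) - \eta_n(f_n)|^p\Big)^{1/p} \le a(p)\, b(n)\, I(\mathcal{F}_n)$$

for some finite constant $I(\mathcal{F}_n) < \infty$ that only depends on the class $\mathcal{F}_n$.
Similar but exponential type estimates will also be covered. For instance,
we will check that for any $\epsilon > 0$ and $N$ sufficiently large

$$\mathbb{P}^N_{\eta_0}\Big(\sup_{f_n\in\mathcal{F}_n} |\eta^N_n(f_n) - \eta_n(f_n)| > \epsilon\Big) \le d_n(\epsilon, \mathcal{F}_n)\, e^{-N\epsilon^2/b(n)}$$

with a finite constant $d_n(\epsilon, \mathcal{F}_n)$ depending on $\epsilon$ and on the class $\mathcal{F}_n$. From
these estimates and using the Borel-Cantelli lemma, we conclude the almost
sure convergence result

$$\lim_{N\to\infty}\ \sup_{f_n\in\mathcal{F}_n} |\eta^N_n(f_n) - \eta_n(f_n)| = 0$$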



When the Markov transitions are sufficiently regular, we will obtain a series
of uniform convergence results with respect to the time parameter. For
instance, we will prove that for sufficiently mixing kernels and for any
parameter $p \ge 1$

$$\sqrt{N}\ \sup_{n\ge 0}\ \mathbb{E}^N_{\eta_0}\Big(\sup_{f'_n\in\mathcal{F}'_n} |\eta'^{N}_n(f'_n) - \eta'_n(f'_n)|^p\Big)^{1/p} \le a(p)\, b\, I(\mathcal{F}')$$

for another finite constant $I(\mathcal{F}') < \infty$ whose values depend on the collection
of functions $\mathcal{F}'_n \subset \mathcal{B}_b(E'_n)$. Nevertheless, without getting into more
detail, we mention that the uniform estimates with respect to the time
parameter will not apply in the path-valued situation and for genealogical
models.

The corresponding fluctuations and large deviations will also be discussed
in Chapter 9 and Chapter 10. Roughly speaking, these results will give the
exact asymptotic deviations of $\eta^N_n$ around the limiting distribution $\eta_n$.

Another interesting way to measure the performance of these particle
models consists in studying the adequacy of laws and the independence
between the particles. These propagation-of-chaos results are concerned
with the asymptotic behavior of the distributions of the particle paths (see
Chapter 8). To describe these results in some detail, it is convenient to
introduce some notation.







For each $n \ge 0$ and $N \ge 1$, the state space $E_0^N \times \cdots \times E_n^N$ represents
the set of paths from the origin up to time $n$ of the Markov particle model,
while the product space $(E_0 \times \cdots \times E_n)^N$ represents the state space of
the $N$ paths of each elementary particle from the origin up to time $n$. To
connect these two spaces, we introduce the change of coordinate mapping

$$\Theta^N_n : (E_0^N \times \cdots \times E_n^N) \longrightarrow (E_0 \times \cdots \times E_n)^N$$

defined by $\Theta^N_n[(x^i_0)_{1\le i\le N}, \ldots, (x^i_n)_{1\le i\le N}] = (x^i_0, \ldots, x^i_n)_{1\le i\le N}$. The
mapping above connects the paths of the particle Markov chain with the
elementary particle paths

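$$\Theta^N_n(\xi_0, \ldots, \xi_n) = (\xi^i_{[0,n]})_{1\le i\le N} =_{\text{def.}} \xi_{[0,n]} \quad\text{with}\quad \xi^i_{[0,n]} = (\xi^i_0, \ldots, \xi^i_n)$$

We recall that the particle Markov chain is defined on the canonical
space

$$\Big(\Omega^N = \prod_{n\ge 0} E_n^N,\ (\xi_n)_{n\in\mathbb{N}},\ \mathbb{P}^N_{\eta_0}\Big)$$

The restriction of $\mathbb{P}^N_{\eta_0}$ on the sequence of paths from the origin up to time
$n$ is given by

$$\mathbb{P}^N_{\eta_0,n} = \mathbb{P}^N_{\eta_0} \circ (\xi_0, \ldots, \xi_n)^{-1} \in \mathcal{P}(E_0^N \times \cdots \times E_n^N)$$

The $\Theta^N_n$ image of $\mathbb{P}^N_{\eta_0,n}$ on the set of elementary particle paths
$(E_0 \times \cdots \times E_n)^N$ is defined by

$$\mathbb{P}^{(N)}_{\eta_0,n} = \mathbb{P}^N_{\eta_0,n} \circ (\Theta^N_n)^{-1}$$

With some obvious abusive notation, we have

$$\mathbb{P}^{(N)}_{\eta_0,n} = \mathbb{P}^N_{\eta_0} \circ \xi_{[0,n]}^{-1} = \mathrm{Law}\big((\xi^i_0, \ldots, \xi^i_n)_{1\le i\le N}\big)$$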

We denote by P^n\ q<N, their marginals on the first g-particles 
Pa^ = Law((ej,...,ai<K,) 

and we let = Law((^^)i<i<,) be their n-time marginals. Without 

any regularity assumptions on M„, we prove first a rather crude estimate 

When the Markov kernels Mn are regular enough, we will obtain a propagation- 
of-chaos estimate with respect to the relative entropy criterion 

7VEnt(pW«MlCn)<ft(«)9 







Under some additional mixing conditions, we will see that 

" ““P IICh - S ‘(«) ^ I Cn) S » « ‘ 

n>0 

From these uniform estimates, we obtain increasing propagation of chaos
with respect to increasing time horizons and particle block sizes. More
precisely, for any increasing sequences $q = q(N) \uparrow \infty$ and $n = n(N) \uparrow \infty$,
we have



,(JV)n(W) = m =!• Jta I ,) = 0 



3.5.2 Regularity Conditions 

Besides the modeling and the applications of Feynman-Kac and particle
methods, a rather large part of the book is concerned with qualitative and 
asymptotic properties of the models as the time horizon or the size of the 
system tends to infinity. We discuss topics covered by traditional books on 
ordinary Markov chains and Monte Carlo methods: stability of semigroups, 
existence and uniqueness of invariant measures, annealed and concentration 
properties, weak and strong laws of large numbers, fluctuation and large- 
deviation principles, and propagation of chaos. 

The study of each of these topics requires a specific mathematical
technique and a precise set of regularity conditions on the pair of
potentials/kernels $(G_n, M_n)$. Apart from some purely technical assumptions
that we have made to simplify the analysis, most of the book is based on two
types of regularity conditions. As a guide to their use, we provide in this
section a short discussion of these conditions.

Before entering into more detail, we first recall that the Feynman-Kac
distribution flow associated with an abstract and general pair of
potentials/kernels $(G_n, M_n)$ is in general well-defined only up to the first
(deterministic) time $\tau$ we have $\mathbb{E}_{\eta_0}\big(\prod_{p=0}^{\tau} G_p(X_p)\big) = 0$. This deterministic
horizon $\tau$ depends on the triplet $(\eta_0, G_n, M_n)$. In this situation, we have
seen on page 102 that the $N$-particle model is also stopped the first time
$\tau^N$ the whole configuration falls outside the support of the current
potential function. This general situation, which might appear as a study of
Feynman-Kac models with minimal regularity structure, will be discussed
in Chapter 7 (see also Remark 9.4.1 on page 301, and Section 12.2.7).

The analysis is technically less involved when the potential functions
satisfy the following condition:

(G) There exists a sequence of strictly positive numbers $\epsilon_n(G) \in (0, 1]$
such that for any $x_n, y_n \in E_n$

$$G_n(x_n) \ge \epsilon_n(G)\, G_n(y_n) > 0$$



Most of the topics developed in this book will be based on this single 
regularity condition, except the following subjects: 

• Stability of Feynman-Kac semigroups. 

• Uniform convergence results with respect to the time parameter. 

• Relative entropy and Lp propagation-of-chaos estimates. 

• Fluctuations and large-deviation principles on path space. 

To study these questions, we will suppose that the collection of distributions
$M_{n+1}(x_n, \cdot)$ are absolutely continuous with one another. That is, for each
$n \ge 0$ and $x_n, y_n \in E_n$, we have $M_{n+1}(x_n, \cdot) \ll M_{n+1}(y_n, \cdot)$. We will
strengthen this condition in three ways:

• $(M)_m$ There exists some integer $m \ge 1$ and some sequence of numbers
$\epsilon_p(M) \in (0, 1)$ such that for any $p$ and $x_p, y_p \in E_p$ we have

$$M_{p,p+m}(x_p, \cdot) = M_{p+1} M_{p+2} \cdots M_{p+m}(x_p, \cdot) \ge \epsilon_p(M)\, M_{p,p+m}(y_p, \cdot)$$

• $(M)^{(p)}$ For each $n \ge 1$ and $x_{n-1} \in E_{n-1}$, $M_n(x_{n-1}, \cdot)$ is absolutely
continuous with respect to $\eta_n$ and we have

$$M_n(x_{n-1}, dx_n) = k_n(x_{n-1}, x_n)\, \eta_n(dx_n)$$

with $\sup_{x_{n-1}\in E_{n-1}} k_n(x_{n-1}, \cdot) \in L_p(\eta_n)$.

• $(M)^{\exp}$ For each $n \ge 1$, there exists a reference probability measure
$p_n \in \mathcal{P}(E_n)$ and a measurable function $a_n$ on $E_n$ such that for any
$x_{n-1} \in E_{n-1}$

$$M_n(x_{n-1}, dx_n) = m_n(x_{n-1}, x_n)\, p_n(dx_n)$$

and $\sup_{x_{n-1}\in E_{n-1}} |\log m_n(x_{n-1}, x_n)| \le a_n(x_n)$ with $p_n(e^{r a_n}) < \infty$,
for any $r \ge 1$.

To guide the reader, we mention the different places in which these condi- 
tions are used. 

• Under (G), we will study the asymptotic behavior of particle density 
profiles and genealogical-tree occupation measures: 

- Exponential and Lp mean error estimates.

- Weak convergence of empirical processes. 

- Central limit theorems. 

- Propagation-of-chaos estimates for the total variation norm. 







• Under (G) and {M)m, we will discuss: 

- Contraction and asymptotic stability properties of Feynman- 
Kac semigroups. 

- Annealed properties of Feynman-Kac flows. 

- Spectral estimates of Feynman-Kac-Schrodinger semigroups. 

- Uniform estimates for particle density profiles with respect to 
the time parameter. 

- Long time behavior of genealogical tree-based models. 

• Under (G) and $(M)^{(p)}$, we will discuss entropy propagation-of-chaos
estimates.

• Under (G) and $(M)^{\exp}$, we will discuss fluctuation and large-deviation
principles for particle models in path space.

Note that condition $(M)^{(p)}$ is met as soon as we have

$$M_n(x_{n-1}, dx_n) \le h_n(x_n)\, M_n(y_{n-1}, dx_n) \qquad (3.12)$$

for some $[1, \infty)$-valued function $h_n \in L_p(\eta_n)$ ($\Longleftarrow \|M_n(h_n^p)\| < \infty$). To prove
this claim we first recall that $\hat\eta_{n-1} M_n = \eta_n$. Integrating (3.12) with respect
to $\hat\eta_{n-1}$ we find that

$$k_n(x_{n-1}, x_n) = \frac{dM_n(x_{n-1}, \cdot)}{d\eta_n}(x_n) \in [1/h_n(x_n),\ h_n(x_n)]$$

In the same way, we conclude that $(M)^{\exp}$ holds true with $(m_n, p_n) =
(k_n, \eta_n)$ as soon as $h_n \in \cap_{p\ge 1} L_p(\eta_n)$.
To illustrate this condition we note that the one dimensional
Gaussian transitions

$$M_n(x_{n-1}, dx_n) = \frac{1}{\sqrt{2\pi}} \exp\Big\{-\frac{1}{2}(x_n - a_n(x_{n-1}))^2\Big\}\, dx_n$$

satisfy (3.12) with $h_n(x_n) = \exp\{\mathrm{osc}(a_n)[|x_n| + \|a_n\|]\}$ as soon as
$\|a_n\| < \infty$. We also have the following

Lemma 3.5.1

$$(M)_{m=1} \implies (M)^{\exp} \implies \big(\,(M)^{(p)} \text{ for any } p \ge 1\,\big)$$

Proof:

To prove the first implication, we choose a point $x^\star \in E$ and we set

$$p_n(dy) = M_n(x^\star, dy) \quad\text{and}\quad m_n(x, y) = \frac{dM_n(x, \cdot)}{dM_n(x^\star, \cdot)}(y)$$

When the uniform mixing condition $(M)_m$ holds true for $m = 1$, we have
for any $x, y \in E$

$$\epsilon_{n-1}(M) \le m_n(x, y) \le \frac{1}{\epsilon_{n-1}(M)}$$

We conclude that $(M)^{\exp}$ is met with $a_n(y) = -\log \epsilon_{n-1}(M)$. Let us prove
the second assertion. Suppose we have

$$e^{-a_n(y)}\, p_n(dy) \le M_n(x, dy) = m_n(x, y)\, p_n(dy) \le e^{a_n(y)}\, p_n(dy)$$

for some pair $(m_n, p_n)$ with, for any $p \ge 1$, $\int \exp(p\, a_n)\, dp_n < \infty$. A simple
calculation shows that for any choice of the reference probability measure
$p_n \in \mathcal{P}(E)$ and for any $p \ge 1$ we have




one concludes that 



It is now easily checked that $\int e^{(2p-1) a_n(y)}\, p_n(dy) < \infty$ for any $p \ge 1$.

The end of the proof is now clear. ■ 

We illustrate these conditions with two typical examples. Other situations
will be examined on page 148. Note that for time homogeneous models
on finite spaces condition $(M)_m$ is met as soon as the Markov chain is
aperiodic and irreducible.
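For a time homogeneous chain on a finite space, the remark above can be
checked by direct computation. The Python sketch below looks for the
smallest $m$ such that every entry of $M^m$ is positive and then reads off a
valid constant as the smallest ratio $M^m(x, z)/M^m(y, z)$. This is a
sufficient test only, the cut-off `max_m` is an arbitrary assumption, and the
function names are ours.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mixing_index(M, max_m=50):
    """Return (m, eps) with M^m(x, .) >= eps * M^m(y, .) for all x, y.

    m is the smallest power with strictly positive entries; None is
    returned when no such power is found up to max_m.
    """
    n = len(M)
    power = M
    for m in range(1, max_m + 1):
        if all(value > 0 for row in power for value in row):
            eps = min(power[x][z] / power[y][z]
                      for x in range(n) for y in range(n) for z in range(n))
            return m, eps
        power = mat_mul(power, M)
    return None


# An irreducible and aperiodic chain on three states.
cyclic_lazy = [[0.5, 0.5, 0.0],
               [0.0, 0.5, 0.5],
               [0.5, 0.0, 0.5]]
# A periodic chain: no power of this matrix has all entries positive.
flip = [[0.0, 1.0],
        [1.0, 0.0]]
```

For `cyclic_lazy` the second power is the first one with positive entries,
whereas the periodic chain `flip` never passes the test.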

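Example 3.5.1 Suppose that $E_n = \mathbb{R}^d$ and $M_n$ is given by

$$M_n(x, dy) = \frac{1}{((2\pi)^d |Q_n|)^{1/2}} \exp\Big[-\frac{1}{2}(y - A_n(x))'\, Q_n^{-1}\, (y - A_n(x))\Big]\, dy$$

where $Q_n$ is a $d \times d$ symmetric nonnegative matrix and $A_n : \mathbb{R}^d \to \mathbb{R}^d$ is
a bounded function. Using previous observations, it is not difficult to check
that $(M)^{\exp}$ is satisfied with

$$p_n(dy) = \frac{1}{((2\pi)^d |Q_n|)^{1/2}} \exp\Big[-\frac{1}{2}\, y'\, Q_n^{-1}\, y\Big]\, dy$$

and

$$\log m_n(x, y) = -\frac{1}{2}(y - A_n(x))'\, Q_n^{-1}\, (y - A_n(x)) + \frac{1}{2}\, y'\, Q_n^{-1}\, y = A_n(x)'\, Q_n^{-1}\, y - \frac{1}{2}\, A_n(x)'\, Q_n^{-1}\, A_n(x)$$

To see this claim, it suffices to observe that

$$|\log m_n(x, y)| \le |A_n(x)'\, Q_n^{-1}\, y| + \frac{1}{2}\, |A_n(x)'\, Q_n^{-1}\, A_n(x)| \le a_n(y) = \|A_n\|\, |||Q_n^{-1}|||_1\, \|y\|_1 + \frac{1}{2}\, \|A_n\|^2\, |||Q_n^{-1}|||_1$$

where $\|y\|_1 = \sum_{i=1}^d |y^i|$,

$$\|A_n\| = \sup_{x}\ \sup_{1\le i\le d} |A^i_n(x)| \quad\text{and}\quad |||Q_n^{-1}|||_1 = \sup_{1\le i\le d} \sum_{1\le j\le d} |Q_n^{-1}(i, j)|$$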

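Example 3.5.2 For $E_n = \mathbb{R}$ and $M_n$ given by

$$M_n(x, dy) = \frac{c(n)}{2}\, e^{-c(n)\, |y - A_n(x)|}\, dy$$

for some $c(n) > 0$ and $\mathrm{osc}(A_n) < \infty$, condition $(M)_m$ holds true for $m = 1$.
Indeed we have

$$\log \frac{dM_n(x, \cdot)}{dM_n(y, \cdot)}(z) = c(n)\, \big[\, |z - A_n(y)| - |z - A_n(x)|\, \big]$$

Recalling that $|\,|x - a| - |x - b|\,| \le |b - a|$, we readily find that

$$\Big|\log \frac{dM_n(x, \cdot)}{dM_n(y, \cdot)}(z)\Big| \le c(n)\, |A_n(x) - A_n(y)| \le c(n)\, \mathrm{osc}(A_n)$$

We conclude that the mixing condition $(M)_m$ holds true for $m = 1$ and

$$\epsilon_{n-1}(M) = \exp(-c(n)\, \mathrm{osc}(A_n))$$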
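The bound of Example 3.5.2 can be checked numerically. The short Python
sketch below evaluates the density ratio of two Laplace transitions on a
grid and compares it with $\exp(-c(n)\, \mathrm{osc}(A_n))$. The numerical
values of `c` and of the two centers are arbitrary illustrative assumptions,
and the function names are ours.

```python
import math


def laplace_density(c, center, z):
    """Density at z of the Laplace law with rate c centered at center."""
    return 0.5 * c * math.exp(-c * abs(z - center))


def mixing_constant(c, osc):
    """Constant exp(-c * osc) obtained in the example."""
    return math.exp(-c * osc)


def min_density_ratio(c, a_x, a_y, grid):
    """Smallest ratio of the two transition densities over the grid."""
    return min(laplace_density(c, a_x, z) / laplace_density(c, a_y, z)
               for z in grid)


c = 2.0
centers = (-0.5, 0.5)          # two values of A_n, so that osc(A_n) = 1
grid = [k / 10.0 for k in range(-100, 101)]
worst_ratio = min(min_density_ratio(c, a, b, grid)
                  for a in centers for b in centers)
```

With these values the worst ratio is attained for points far to the right of
both centers, where it coincides with `mixing_constant(c, 1.0)` up to
rounding errors, so the constant of the example cannot be improved here.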




4 

Stability of Feynman-Kac Semigroups 



4.1 Introduction 

This chapter is devoted to structural and stability properties of Feynman-Kac
semigroups. These regularity properties appear to be central in the
understanding of various topics discussed in this book. They will be
applied in Chapter 5 to analyze the existence and the uniqueness of invariant
measures, and in Chapter 12 they are used to study the asymptotic
stability of nonlinear filtering equations. In Chapter 6, we will also use these
results to analyze the concentration properties of annealed Feynman-Kac
semigroups associated with a cooling schedule. Their applications to the
study of the long time behavior of particle methods will be the main object
of Chapter 7.

To explain and motivate the organization of this chapter, it is convenient
to observe that for constant potential functions Feynman-Kac models
coincide with traditional Markov semigroups. The study of the stability of
Markov models is one of the most active research subjects in probability
theory. We refer the reader to traditional textbooks on this theme (see for
instance the book [250] and references therein). Among the variety of
approaches developed in this field, R.L. Dobrushin introduced in a two-part
article [116] in 1956 a powerful measure-theoretic technique for studying
the contraction properties of a Markov kernel with respect to the total
variation distance. The main feature of this approach is that it applies
to Markov transitions on general measurable spaces and it does not use
any assumptions on the invariant measures of the chain. These ideas were
pursued and extended in [93] with a systematic study of Markov contraction
and ergodic constants with respect to a general class of distance-like
entropy criteria. The essentials of this study are provided in Section 4.2.
Section 4.3 is concerned with the extensions of these results to Feynman-Kac
semigroups.

The rest of the chapter has the following structure. In Section 4.3.1, we 
provide several functional inequalities in terms of the Dobrushin ergodic co- 
efficient of a Feynman-Kac type Markov kernel and in terms of the relative 
oscillations of a sequence of potential functions. In Section 4.3.2, we pro- 
pose a semigroup approach to control these two quantities. In Section 4.3.3, 
we apply these results to derive a collection of strong contraction estimates 
with respect to a fairly general class of relative entropies. Another
important feature of this approach is that it applies to estimates of the "weak"
regularity of Feynman-Kac semigroups. These questions are discussed in
Section 4.3.4. The kinship between the stability properties of updated and
prediction semigroups is studied in Section 4.4. We complete this chapter 
with an application of these results to the study of asymptotic stability 
properties of a class of stochastic Feynman-Kac semigroups arising in non- 
linear filtering (see Section 4.5). 



4.2 Contraction Properties of Markov Kernels 

Let $(E, \mathcal{E})$ and $(F, \mathcal{F})$ be a pair of measurable spaces. In this section, we
develop general contraction properties of Markov kernels $M$ from $(E, \mathcal{E})$
into $(F, \mathcal{F})$. We provide Lipschitz type estimates with respect to various
distance-like criteria. Since the state space does not play a distinguished 
role, we simplify the presentation and we use the same notation to denote 
any relative entropy criteria on the set of measures on possibly different 
state spaces. 



4.2.1 h-relative Entropy 

Let $h : \mathbb{R}_+^2 \to \mathbb{R} \cup \{\infty\}$ be a convex function satisfying for any $a, x, y \in \mathbb{R}_+$
the following conditions:

$$h(ax, ay) = a\, h(x, y) \quad\text{and}\quad h(1, 1) = 0$$

We associate with this homogeneous function the $h$-relative entropy on
$\mathcal{M}_+(E)$ defined symbolically as

$$H(\mu, \nu) = \int h(d\mu, d\nu)$$

More precisely, by homogeneity arguments, the mapping $H$ is defined in
terms of any measure $\lambda \in \mathcal{M}(E)$ dominating $\mu$ and $\nu$ by the formula

$$H(\mu, \nu) = \int h\Big(\frac{d\mu}{d\lambda}, \frac{d\nu}{d\lambda}\Big)\, d\lambda \qquad (4.1)$$

To illustrate this abstract definition and motivate the forthcoming analysis,
we provide hereafter a collection of classical $h$-relative entropies arising in
the literature. First we come back to the definition of $h$-entropy. We denote
by $h' : \mathbb{R}_+ \to \mathbb{R} \cup \{+\infty\}$ the convex function given for any $x \in \mathbb{R}_+$ by
$h'(x) = h(x, 1)$.

By homogeneity arguments, we note that $h$ is almost equivalent to $h'$.
More precisely, only the specification of the value $h(1, 0)$ is missing. In most
applications, the natural convention is $h(1, 0) = \infty$.

The next lemma connects the h-relative entropy with the h'-divergence
in the sense of Csiszar [68].

Lemma 4.2.1 Assume that $h(1, 0) = +\infty$. Then, for any $\mu$ and $\nu \in \mathcal{M}_+(E)$,
we have

$$H(\mu, \nu) = \int h'\left(\frac{d\mu}{d\nu}\right) d\nu \qquad (4.2)$$

if $\mu \ll \nu$ and $H(\mu, \nu) = \infty$ otherwise.

Proof:

Let $\mu = \mu_1 + \mu_2$ be the Lebesgue decomposition of $\mu$ with respect to $\nu$.
That is, we have that $\mu_1 \ll \nu$ and $\mu_2 \perp \nu$. Also let $A \in \mathcal{E}$ be such that
$\nu(A^c) = 0 = \mu_2(A)$. To compute $H(\mu, \nu)$, we can take $\lambda =_{\rm def.} \nu + \mu_2$ in
(4.1) and we get

$$H(\mu, \nu) = \int_A h\left(\frac{d\mu_1}{d\nu}, 1\right) d\nu + \int_{A^c} h(1, 0)\, d\mu_2 = H(\mu_1, \nu) + h(1, 0)\, \mu_2(E)$$

If $\mu_2(E) > 0$, we deduce that $H(\mu, \nu) = +\infty$. Otherwise, we take in (4.1)
$\lambda = \nu$ and we get (4.2). This ends the proof of the lemma. ■



In the reverse angle, suppose $h' : \mathbb{R}_+ \to \mathbb{R} \cup \{\infty\}$ is a given convex
function. Since $t \in (1, +\infty) \mapsto (h'(t) - h'(1))/(t - 1)$ is nondecreasing, the
limit $l_0 = \lim_{t \to +\infty} h'(t)/t$ exists and for any $l \in [l_0, +\infty]$ we can prove
that any function defined for any $(x, y) \in \mathbb{R}_+^2$ by

$$h(x, y) = \begin{cases} y\, h'(x/y) & \text{if } y > 0 \\ l\, x & \text{if } y = 0 \end{cases} \qquad (4.3)$$

is convex.







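• If we take $h'(t) = |t - 1|^p$, $p \geq 1$, we find the $L_p$-norm given for any
$\mu, \nu \in \mathcal{P}(E)$ by $H(\mu, \nu) = \|1 - d\mu/d\nu\|_{p,\nu}^p$ if $\mu \ll \nu$, and $\infty$ otherwise.

• The case $h'(t) = t \log(t)$ corresponds to the Boltzmann entropy or
Shannon-Kullback information. In this situation, we find that

$$H(\mu, \nu) = \mathrm{Ent}(\mu | \nu) = \int \ln\left(\frac{d\mu}{d\nu}\right) d\mu$$

if $\mu \ll \nu$ and $\infty$ otherwise.

• The Havrda-Charvat entropy of order $p > 1$ corresponds to the choice
$h'(t) = \frac{1}{p-1}(t^p - 1)$. In this case, we have for any $\mu \ll \nu$

$$\mathcal{C}_p(\mu, \nu) =_{\rm def.} \frac{1}{p-1}\left[\int \left(\frac{d\mu}{d\nu}\right)^p d\nu - 1\right]$$

Notice that $\mathcal{C}_p(\mu, \nu) \to \mathrm{Ent}(\mu | \nu)$ as $p$ tends to $1^+$.

• The Hellinger and Kakutani-Hellinger integrals of order $\alpha \in (0, 1)$
correspond to the choice $h'(t) = t - t^\alpha$. For any $\mu, \nu \in \mathcal{P}(E)$, for any
dominating measure $\lambda$, we have

$$\mathcal{H}_\alpha(\mu, \nu) = 1 - \int \left(\frac{d\mu}{d\lambda}\right)^\alpha \left(\frac{d\nu}{d\lambda}\right)^{1-\alpha} d\lambda$$

It is also sometimes written symbolically in the form

$$\mathcal{H}_\alpha(\mu, \nu) =_{\rm def.} 1 - \int (d\mu)^\alpha (d\nu)^{1-\alpha}$$

Notice that it can be rewritten more simply as

$$\mathcal{H}_\alpha(\mu, \nu) = 1 - \int \left(\frac{d\mu}{d\nu}\right)^\alpha d\nu$$

if $\mu \ll \nu$. In the special case $\alpha = 1/2$, this relative entropy coincides
with the Kakutani-Hellinger distance defined by

$$\mathcal{H}_{1/2}(\mu, \nu) =_{\rm def.} \frac{1}{2} \int \left(\sqrt{\frac{d\mu}{d\lambda}} - \sqrt{\frac{d\nu}{d\lambda}}\right)^2 d\lambda$$

or, symbolically, $\mathcal{H}_{1/2}(\mu, \nu) = \frac{1}{2} \int \left(\sqrt{d\mu} - \sqrt{d\nu}\right)^2$.

• Finally, the case $h'(t) = |t - 1|/2$ corresponds to the total variation
distance defined for any $\mu, \nu \in \mathcal{M}_+(E)$ by

$$H(\mu, \nu) = \|\mu - \nu\|_{\rm tv}$$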
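The examples above can be checked numerically on a finite state space, where every h'-divergence reduces to the finite sum $\sum_x \nu(x)\, h'(\mu(x)/\nu(x))$. The following is only a minimal sketch: the two distributions `mu` and `nu` and all function names are illustrative assumptions, and `nu` is taken strictly positive so that $\mu \ll \nu$ holds automatically.

```python
import math

def h_divergence(h_prime, mu, nu):
    """H(mu, nu) = sum_x nu(x) * h'(mu(x) / nu(x)); assumes nu(x) > 0 everywhere."""
    return sum(n * h_prime(m / n) for m, n in zip(mu, nu))

def boltzmann(t):
    # h'(t) = t log t, with the usual convention 0 log 0 = 0
    return t * math.log(t) if t > 0 else 0.0

def havrda_charvat(p):
    # h'(t) = (t^p - 1) / (p - 1), order p > 1
    return lambda t: (t ** p - 1.0) / (p - 1.0)

def hellinger(alpha):
    # h'(t) = t - t^alpha, order alpha in (0, 1)
    return lambda t: t - t ** alpha

def total_variation(t):
    # h'(t) = |t - 1| / 2
    return abs(t - 1.0) / 2.0

# Illustrative distributions on a three-point space (assumed, not from the text).
mu = [0.5, 0.3, 0.2]
nu = [0.2, 0.3, 0.5]

ent = h_divergence(boltzmann, mu, nu)
c_near_1 = h_divergence(havrda_charvat(1.0 + 1e-6), mu, nu)
hell = h_divergence(hellinger(0.5), mu, nu)
hell_direct = 0.5 * sum((math.sqrt(m) - math.sqrt(n)) ** 2 for m, n in zip(mu, nu))
tv = h_divergence(total_variation, mu, nu)

print("Boltzmann entropy      :", ent)
print("Havrda-Charvat, p ~ 1+ :", c_near_1)
print("Hellinger, alpha = 1/2 :", hell, "vs direct formula", hell_direct)
print("total variation        :", tv)
```

On this example the Havrda-Charvat value for $p$ close to 1 agrees with the Boltzmann entropy up to a small error, and the Hellinger integral of order 1/2 agrees with the Kakutani-Hellinger formula.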

For later use, we have collected in the next lemma three equivalent 
representations of the total variation distance. 







Lemma 4.2.2 For any pair of probability measures $(m_1, m_2)$ on $E$,
we have

$$\|m_1 - m_2\|_{\rm tv} = \sup\{|m_1(f) - m_2(f)| \;;\; f \in \mathrm{Osc}_1(E)\} \qquad (4.4)$$

$$= 1 - \sup_{\nu \leq m_1 \text{ and } m_2} \nu(E) \qquad (4.5)$$

$$= 1 - \inf \sum_{p=1}^{n} (m_1(A_p) \wedge m_2(A_p)) \qquad (4.6)$$

where the infimum is taken over all finite resolutions of $E$ into pairs
of nonintersecting subsets $A_p$, $1 \leq p \leq n$, with $n \geq 1$.

Proof: 

To prove (4.4), we recall that the total variation distance between
two probability measures $m_1$ and $m_2$ can alternatively be defined in
terms of a Hahn-Jordan orthogonal decomposition

$$m = m_1 - m_2 = m^+ - m^-$$

with $\|m_1 - m_2\|_{\rm tv} = m^+(E) = m^-(E)$. From this observation, we
have for any $f \in \mathrm{Osc}_1(E)$ (the case $m_1 = m_2$ being trivial)

$$|m_1(f) - m_2(f)| = \left|\int f(x)\, m^+(dx) - \int f(y)\, m^-(dy)\right| = \frac{1}{m^+(E)} \left|\int\!\!\int (f(x) - f(y))\, m^+(dx)\, m^-(dy)\right|$$

We conclude that

$$|m_1(f) - m_2(f)| \leq \|m_1 - m_2\|_{\rm tv}$$

By taking the supremum over all $f \in \mathrm{Osc}_1(E)$, we find that

$$\sup\{|m_1(f) - m_2(f)| \;;\; f \in \mathrm{Osc}_1(E)\} \leq \|m_1 - m_2\|_{\rm tv}$$

The reverse inequality can be checked easily by noting that the indicator
functions $1_A$, with $A \in \mathcal{E}$, belong to $\mathrm{Osc}_1(E)$. Now we come to
the proof of (4.5). By construction, there exist two disjoint subsets
$E_+$ and $E_-$ such that

$$m^+(E_-) = 0 = m^-(E_+)$$

Therefore we have for any $A \in \mathcal{E}$

$$m^+(A) = m(A \cap E_+) \geq 0 \quad \text{and} \quad m^-(A) = -m(A \cap E_-) \geq 0$$







from which we conclude that

$$m_1(A \cap E_+) \geq m_2(A \cap E_+) \quad \text{and} \quad m_2(A \cap E_-) \geq m_1(A \cap E_-) \qquad (4.7)$$

Let $\nu$ be defined for any $A \in \mathcal{E}$ by

$$\nu(A) = m_1(A \cap E_-) + m_2(A \cap E_+)$$

By construction, we have

$$\nu(A) \leq m_1(A) \wedge m_2(A) \quad \text{and} \quad \nu(E) = m_1(E_-) + m_2(E_+) \qquad (4.8)$$

Since

$$\|m_1 - m_2\|_{\rm tv} = m^+(E) = m(E_+) = m_1(E_+) - m_2(E_+) = 1 - (m_1(E_-) + m_2(E_+))$$

by (4.8) we obtain

$$1 - \sup_{\mu \leq m_1 \text{ and } m_2} \mu(E) \leq 1 - \nu(E) = \|m_1 - m_2\|_{\rm tv}$$

The reverse inequality is proved as follows. Let $\mu$ be a nonnegative
measure such that for any $A \in \mathcal{E}$ we have

$$\mu(A) \leq m_1(A) \wedge m_2(A)$$

If we take $A = E_+$ and then $A = E_-$, we necessarily have by (4.7)

$$\mu(E_+) \leq m_2(E_+) \quad \text{and} \quad \mu(E_-) \leq m_1(E_-)$$

and therefore

$$\mu(E) \leq m_1(E_-) + m_2(E_+) = 1 - \|m_1 - m_2\|_{\rm tv}$$

We conclude that

$$1 - \mu(E) \geq \|m_1 - m_2\|_{\rm tv}$$

Taking the infimum over all the distributions $\mu \leq m_1$ and $m_2$, we find
the desired result. To prove (4.6), we use the same ideas and notation
as above. First, we note by (4.7) that

$$m_2(E_+) = m_1(E_+) \wedge m_2(E_+) \quad \text{and} \quad m_1(E_-) = m_1(E_-) \wedge m_2(E_-)$$

This implies that

$$\nu(E) = m_1(E_-) + m_2(E_+) = (m_1(E_-) \wedge m_2(E_-)) + (m_1(E_+) \wedge m_2(E_+))$$







Since $E_+$, $E_-$ are disjoint, we conclude that

$$\nu(E) \geq \inf \sum_{p=1}^{n} (m_1(A_p) \wedge m_2(A_p))$$

where the infimum is taken over all resolutions of $E$ into pairs of
nonintersecting subsets $A_p$, $1 \leq p \leq n$, $n \geq 1$. To prove the reverse
inequality, we come back to the definition of $\nu$. By (4.8), for any finite
resolution $A_p \in \mathcal{E}$, $1 \leq p \leq n$, we have

$$\nu(A_p) \leq m_1(A_p) \wedge m_2(A_p)$$

and therefore

$$\nu(E) = \sum_{p=1}^{n} \nu(A_p) \leq \sum_{p=1}^{n} (m_1(A_p) \wedge m_2(A_p))$$

We end the proof of (4.6) by taking the infimum over all resolutions.
Since $\nu(E) = 1 - \|m_1 - m_2\|_{\rm tv}$ the end of the proof of the lemma is
now straightforward. ■
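On a finite state space the three representations of Lemma 4.2.2 can be compared directly. The sketch below is an illustration under assumed data (the two distributions `m1`, `m2` are arbitrary choices): it evaluates the half-$L_1$ formula, the mass of the positive part on the Hahn-Jordan set $E_+$, the overlap formula with the finest resolution into singletons, and a brute-force supremum over indicator functions of subsets.

```python
from itertools import combinations

# Illustrative probability vectors on a three-point space (assumed).
m1 = [0.1, 0.4, 0.5]
m2 = [0.3, 0.3, 0.4]
points = range(len(m1))

# Total variation as half the L1 distance between the two vectors.
tv_half_l1 = 0.5 * sum(abs(a - b) for a, b in zip(m1, m2))

# Hahn-Jordan: E_+ = {x : m1(x) > m2(x)} and ||m1 - m2||_tv = m^+(E) = m(E_+).
tv_hahn = sum(a - b for a, b in zip(m1, m2) if a > b)

# Overlap formula: on a finite space the resolution into singletons attains
# the infimum, since min(a1 + a2, b1 + b2) >= min(a1, b1) + min(a2, b2).
tv_overlap = 1.0 - sum(min(a, b) for a, b in zip(m1, m2))

# Supremum of |m1(A) - m2(A)| over all subsets A (indicators lie in Osc_1(E)).
tv_sup_sets = max(
    abs(sum(m1[i] - m2[i] for i in subset))
    for size in range(len(m1) + 1)
    for subset in combinations(points, size)
)

print(tv_half_l1, tv_hahn, tv_overlap, tv_sup_sets)
```

All four quantities agree on this example, as the lemma predicts.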



4.2.2 Lipschitz Contractions

In this section, we discuss the regularity properties of a Markov kernel $M$
with respect to the h-relative entropy. We provide a universal Lipschitz
inequality in terms of the Dobrushin ergodic coefficient. In Section 4.3, we
will use this contraction estimate to study the stability properties of
nonlinear Feynman-Kac semigroups. Before getting into the precise description
of this inequality, we recall the definition of the Dobrushin coefficient and
provide some key properties. We recall that the total variation distance on
$\mathcal{M}(E)$ is defined for any $\mu \in \mathcal{M}(E)$ by

$$\|\mu\|_{\rm tv} = \frac{1}{2} \sup_{(A, B) \in \mathcal{E}^2} (\mu(A) - \mu(B))$$

Definition 4.2.1 The Dobrushin contraction or ergodic coefficient $\beta(M)$
of a Markov kernel $M$ from $(E, \mathcal{E})$ into $(F, \mathcal{F})$ is the quantity defined by

$$\beta(M) = \sup\{\|M(x, \cdot) - M(y, \cdot)\|_{\rm tv} \;;\; (x, y) \in E^2\} \in [0, 1]$$

Proposition 4.2.1 Let $M$ be a Markov kernel from $(E, \mathcal{E})$ into $(F, \mathcal{F})$.
For any measure $\mu \in \mathcal{M}(E)$, we have the estimate

$$\|\mu M\|_{\rm tv} \leq \beta(M)\, \|\mu\|_{\rm tv} + (1 - \beta(M))\, |\mu(E)|/2 \qquad (4.9)$$







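In addition, $\beta(M)$ is the operator norm of $M$ on $\mathcal{M}_0(E)$, and we have the
equivalent formulations

$$\beta(M) = \sup_{\mu \in \mathcal{M}_0(E)} \|\mu M\|_{\rm tv} / \|\mu\|_{\rm tv} \qquad (4.10)$$

$$= \sup\{\mathrm{osc}(M(f)) \;;\; f \in \mathrm{Osc}_1(F)\} \qquad (4.11)$$

$$= 1 - \inf \sum_{p=1}^{n} (M(x, A_p) \wedge M(y, A_p)) \qquad (4.12)$$

where the infimum is taken over all $x, y \in E$ and all finite resolutions of $F$
into pairs of nonintersecting subsets $A_p \in \mathcal{F}$, $1 \leq p \leq n$, $n \geq 1$.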

Proof: 

We first prove (4.9) for $\mu \in \mathcal{M}_0(E)$. Arguing as in the proof of Lemma 4.2.2,
by the Hahn-Jordan decomposition theorem, we can write any signed measure
$\mu$ as the difference of two nonnegative and orthogonal measures
$\mu = \mu^+ - \mu^-$. Let $E_+$ and $E_-$ be two disjoint subsets such that

$$\mu^+(E_-) = 0 = \mu^-(E_+)$$

We also recall that in this case

$$\|\mu\|_{\rm tv} = (\mu^+(E) + \mu^-(E))/2 = (\mu(E_+) - \mu(E_-))/2$$

When $\mu$ has a null total mass, we clearly have $\|\mu\|_{\rm tv} = \mu(E_+) = -\mu(E_-)$.
Now, for any $A \in \mathcal{F}$, we observe that

$$\mu M(A) = \mu(1_{E_+} M(1_A)) + \mu(1_{E_-} M(1_A))$$

$$\leq \mu(1_{E_+} M(1_A)) + \inf_{y \in E} (M(y, A))\, \mu(E_-)$$

$$= \int_{E_+} \left[\sup_{y \in E} (M(x, A) - M(y, A))\right] \mu(dx)$$

Taking the supremum in the r.h.s. integral and then over all $A \in \mathcal{F}$, we
conclude that (4.9) holds true for any $\mu \in \mathcal{M}_0(E)$ and

$$\beta(M) \geq \sup_{\mu \in \mathcal{M}_0(E)} \frac{\|\mu M\|_{\rm tv}}{\|\mu\|_{\rm tv}}$$

To complete the proof of (4.10), we note that for any $x, y \in E$ we have
$(\delta_x - \delta_y) \in \mathcal{M}_0(E)$ and $\|\delta_x - \delta_y\|_{\rm tv} = 1$. This yields the desired reverse
inequality

$$\sup_{\mu \in \mathcal{M}_0(E)} \frac{\|\mu M\|_{\rm tv}}{\|\mu\|_{\rm tv}} \geq \beta(M)$$



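By homogeneity arguments, we only need to prove (4.9) for any signed
measure $\mu$ with $\mu(E) > 0$. We use the decomposition $\mu = \bar{\mu} + \tilde{\mu}$ with

$$\bar{\mu} = \frac{\mu^-(E)}{\mu^+(E)}\, \mu^+ - \mu^- \in \mathcal{M}_0(E) \quad \text{and} \quad \tilde{\mu} = \frac{\mu(E)}{\mu^+(E)}\, \mu^+$$

Notice that $\bar{\mu}$ has a natural Hahn-Jordan decomposition

$$\bar{\mu}^+ = \frac{\mu^-(E)}{\mu^+(E)}\, \mu^+, \quad \bar{\mu}^- = \mu^- \quad \Longrightarrow \quad \|\tilde{\mu}\|_{\rm tv} = \mu(E)/2 \quad \text{and} \quad \|\bar{\mu}\|_{\rm tv} = \mu^-(E)$$

One advantage of this decomposition is that

$$\|\mu\|_{\rm tv} = (\mu^+(E) + \mu^-(E))/2 = \|\bar{\mu}\|_{\rm tv} + \|\tilde{\mu}\|_{\rm tv}$$

This implies that

$$\|\mu M\|_{\rm tv} \leq \|\bar{\mu} M\|_{\rm tv} + \|\tilde{\mu} M\|_{\rm tv} \leq \beta(M)\, \|\bar{\mu}\|_{\rm tv} + \|\tilde{\mu}\|_{\rm tv} = \|\tilde{\mu}\|_{\rm tv} + \beta(M)\, (\|\mu\|_{\rm tv} - \|\tilde{\mu}\|_{\rm tv})$$

and finally we get $\|\mu M\|_{\rm tv} \leq \beta(M)\, \|\mu\|_{\rm tv} + (1 - \beta(M))\, \mu(E)/2$. We now
come to the proof of (4.11). Using the representation (4.4), we obtain

$$\beta(M) = \sup_{x, y \in E} \|M(x, \cdot) - M(y, \cdot)\|_{\rm tv}$$

$$= \sup_{x, y \in E} \sup\{|M(f)(x) - M(f)(y)| \;;\; f \in \mathrm{Osc}_1(F)\}$$

$$= \sup\{\sup_{x, y} |M(f)(x) - M(f)(y)| \;;\; f \in \mathrm{Osc}_1(F)\}$$

This ends the proof of (4.11). By the definition of $\beta(M)$, the proof of (4.12)
is a simple consequence of (4.6) in Lemma 4.2.2. This ends the proof of the
proposition. ■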
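For a finite state space, a Markov kernel is a row-stochastic matrix and the Dobrushin coefficient is the largest total variation distance between two rows. The following sketch, with an arbitrarily chosen matrix `M` and signed measure `mu` (both assumptions made for illustration), compares the definition with the overlap formula (4.12) and checks the estimate (4.9).

```python
def tv(m1, m2):
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(m1, m2))

def tv_norm(mu):
    """||mu||_tv = (mu^+(E) + mu^-(E)) / 2 for a signed measure on a finite set."""
    return 0.5 * sum(abs(v) for v in mu)

def dobrushin(M):
    """beta(M) = sup over pairs of rows of the total variation distance."""
    return max(tv(rx, ry) for rx in M for ry in M)

def dobrushin_overlap(M):
    """Formula (4.12) evaluated with the resolution of F into singletons."""
    return 1.0 - min(sum(min(a, b) for a, b in zip(rx, ry)) for rx in M for ry in M)

def push(mu, M):
    """The measure mu M, i.e. (mu M)(z) = sum_x mu(x) M(x, z)."""
    return [sum(mu[x] * M[x][z] for x in range(len(M))) for z in range(len(M[0]))]

# Illustrative kernel and signed measure (assumed, not from the text).
M = [[0.5, 0.5, 0.0],
     [0.2, 0.3, 0.5],
     [0.1, 0.1, 0.8]]
mu = [0.7, -0.4, 0.2]  # total mass 0.5

beta = dobrushin(M)
lhs = tv_norm(push(mu, M))
rhs = beta * tv_norm(mu) + (1.0 - beta) * abs(sum(mu)) / 2.0

print("beta(M)              :", beta)
print("overlap formula      :", dobrushin_overlap(M))
print("||mu M||_tv <= bound :", lhs, "<=", rhs)
```

The brute-force maximum over pairs of rows is enough here because the state space is finite; for general kernels the supremum in Definition 4.2.1 need not be attained.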

We are now in a position to state the main result of this section. 

Theorem 4.2.1 For any pair of probability measures $\mu$ and $\nu \in \mathcal{P}(E)$ and
for any Markov kernel $M$ from $E$ into $F$, we have the contraction estimate

$$H(\mu M, \nu M) \leq \beta(M)\, H(\mu, \nu)$$

The proof of this theorem is based on a key technical lemma that provides a
strategy to compare integrals of convex functions on $\mathbb{R}_+^2$. Before stating this
result, it is convenient to examine the scalar case. This result is essentially
Lemma 3.3 in [60] (see also Exercise 249 in [174]) but for noncompactly
supported measures. In comparison with these two referenced works, our
strategy of proof here is to use monotone convergence arguments instead
of uniform convergence.

Lemma 4.2.3 Let $m_1$, $m_2$ be two bounded measures on the Borelian real
line $(\mathbb{R}, \mathcal{R})$ admitting a first moment and such that

• $m_1$ and $m_2$ are acting in the same manner on affine mappings:
$m_1(\mathbb{R}) = m_2(\mathbb{R})$ and $\int t\, m_1(dt) = \int t\, m_2(dt)$

• For any $s \in \mathbb{R}$, $\int |t - s|\, m_1(dt) \leq \int |t - s|\, m_2(dt)$.

Then for any convex function $h'$, we have $m_1(h') \leq m_2(h')$ (the value $+\infty$
is not excluded).

Proof: 

Let $h'$ be a given convex function on $\mathbb{R}$. One can find two two-sided
sequences $(x_i)_{i \in \mathbb{Z}^*}$ and $(k_i)_{i \in \mathbb{Z}^*}$ of nonnegative reals such that if we denote
for $n \geq 1$ and $t \in \mathbb{R}$

$$h'_n(t) = h'(0) + \partial^+ h'(0)\, t + \sum_{i=1}^{n} k_i (t - x_i)^+ + \sum_{i=1}^{n} k_{-i} (t + x_{-i})^-$$

where $\partial^+ h'$ is the right derivative of $h'$, then $(h'_n)_{n \geq 1}$ is an increasing
sequence converging towards $h'$. To see this claim, we note that
$t \in \mathbb{R} \mapsto \partial^+ h'(t) - \partial^+ h'(0)$ is nondecreasing and we have

$$h'(t) = h'(0) + \partial^+ h'(0)\, t + \int_0^t (\partial^+ h'(s) - \partial^+ h'(0))\, ds$$

Then we approximate from below the latter function by nondecreasing
step functions (for instance, constant on appropriate dyadic intervals) to
conclude at the desired convergence. Coming back to $m_1$ and $m_2$, we note
that for any $i \geq 1$

$$\int k_i (t - x_i)^+\, m_1(dt) = \frac{1}{2} \int k_i |t - x_i|\, m_1(dt) + \frac{1}{2} \int k_i (t - x_i)\, m_1(dt)$$

$$\leq \frac{1}{2} \int k_i |t - x_i|\, m_2(dt) + \frac{1}{2} \int k_i (t - x_i)\, m_2(dt)$$

$$= \int k_i (t - x_i)^+\, m_2(dt)$$

In the same way, for nonpositive parts, we find that

$$\int k_{-i} (t + x_{-i})^-\, m_1(dt) \leq \int k_{-i} (t + x_{-i})^-\, m_2(dt)$$

This implies that for any $n \geq 1$

$$\int h'_n(t)\, m_1(dt) \leq \int h'_n(t)\, m_2(dt)$$

We end the proof by letting $n$ tend to infinity and using the monotone
convergence theorem.
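Lemma 4.2.3 can be illustrated with two discrete measures on the real line. In the sketch below, `m2` is obtained from `m1` by a mean-preserving spread; both measures, the grid of test points `s` and the list of convex test functions are assumptions chosen for illustration. The code checks the two hypotheses of the lemma (on a finite grid only) and then the conclusion $m_1(h') \leq m_2(h')$ for each test function.

```python
import math

# Discrete measures given as {atom: weight}; m2 spreads each atom of m1
# symmetrically, which keeps the total mass and the first moment unchanged.
m1 = {-1.0: 0.5, 1.0: 0.5}
m2 = {-2.0: 0.25, 0.0: 0.5, 2.0: 0.25}

def integrate(m, f):
    """Integral of f against the discrete measure m."""
    return sum(w * f(t) for t, w in m.items())

# First hypothesis: same action on affine mappings (mass and mean).
same_mass = abs(integrate(m1, lambda t: 1.0) - integrate(m2, lambda t: 1.0)) < 1e-12
same_mean = abs(integrate(m1, lambda t: t) - integrate(m2, lambda t: t)) < 1e-12

# Second hypothesis, checked on a finite grid of points s only.
grid = [k / 10.0 for k in range(-40, 41)]
abs_condition = all(
    integrate(m1, lambda t, s=s: abs(t - s))
    <= integrate(m2, lambda t, s=s: abs(t - s)) + 1e-12
    for s in grid
)

# Conclusion of the lemma for a few convex functions.
convex_tests = {
    "t^2": lambda t: t * t,
    "exp(t)": math.exp,
    "|t|^3": lambda t: abs(t) ** 3,
    "(t - 0.5)^+": lambda t: max(t - 0.5, 0.0),
}
ordered = {
    name: integrate(m1, f) <= integrate(m2, f) + 1e-12
    for name, f in convex_tests.items()
}

print(same_mass, same_mean, abs_condition)
print(ordered)
```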



In the present form, the previous lemma would only imply Theorem 4.2.1
for probabilities satisfying $\mu \ll \nu$, so let us modify it a little:







Lemma 4.2.4 Let $m_1$, $m_2$ be two bounded measures on the Borelian quadrant
$\mathbb{R}_+^2$ admitting a first moment and such that

• $m_1$ and $m_2$ are acting in the same way on affine mappings with
$m_1(\mathbb{R}_+^2) = m_2(\mathbb{R}_+^2)$,

$$\int s\, m_1(ds, dt) = \int s\, m_2(ds, dt) \quad \text{and} \quad \int t\, m_1(ds, dt) = \int t\, m_2(ds, dt)$$

• For any $a, b \in \mathbb{R}$, $\int |as - bt|\, m_1(ds, dt) \leq \int |as - bt|\, m_2(ds, dt)$

Then for any convex and homogeneous function $h$ on $\mathbb{R}_+^2$ we have the
inequality $m_1(h) \leq m_2(h)$ (the value $+\infty$ is again not excluded).



Proof:

We recall that any convex and homogeneous function $h$ on the product
space $\mathbb{R}_+^2$ has the form (4.3) for some convex function $h'$ on $\mathbb{R}_+$. Using
Lemma 4.2.3, we find that we simply need to check that for all $a, b \in \mathbb{R}$,

$$\int (as - bt)^+\, m_1(ds, dt) \leq \int (as - bt)^+\, m_2(ds, dt)$$

$$\int 1_{\{t = 0\}}\, s\, m_1(ds, dt) \leq \int 1_{\{t = 0\}}\, s\, m_2(ds, dt) \qquad (4.13)$$

The last inequality is needed for the cases where $l > l_0$. Note that this result
can be deduced from the first condition by letting $b \to \infty$ with $a = 1$.
Finally, under our assumptions, a simple subtraction shows that (4.13) holds
true. ■



Now we come to the proof of the theorem. 



Proof of Theorem 4.2.1: 

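Let $\mu, \nu \in \mathcal{P}(E)$ be given and let $\lambda \in \mathcal{M}(E)$ be a dominating measure. We
apply Lemma 4.2.4 to the measures $m_1$ and $m_2$ on $(\mathbb{R}_+^2, \mathcal{R}_+^{\otimes 2})$ defined for
any $h \in \mathcal{B}_b(\mathbb{R}_+^2)$ by the formulae

$$m_1(h) = \int h\left(\frac{d\mu M}{d\lambda M}, \frac{d\nu M}{d\lambda M}\right) d\lambda M$$

$$m_2(h) = \beta(M) \int h\left(\frac{d\mu}{d\lambda}, \frac{d\nu}{d\lambda}\right) d\lambda + (1 - \beta(M))\, h(1, 1)$$

The first condition of Lemma 4.2.4 is immediate, and the second one
amounts to proving that for all $a, b \in \mathbb{R}$,

$$\int \left|a \frac{d\mu M}{d\lambda M} - b \frac{d\nu M}{d\lambda M}\right| d\lambda M \leq \beta(M) \int \left|a \frac{d\mu}{d\lambda} - b \frac{d\nu}{d\lambda}\right| d\lambda + (1 - \beta(M))\, |a - b|$$

In other words, in terms of the total variation distance, we need to check
that

$$\|(a\mu - b\nu) M\|_{\rm tv} \leq \beta(M)\, \|a\mu - b\nu\|_{\rm tv} + (1 - \beta(M))\, |a - b|/2$$

which is clear from Proposition 4.2.1 since $(a\mu - b\nu)(E) = a - b$. ■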
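The contraction estimate of Theorem 4.2.1 can be observed numerically for the Boltzmann relative entropy on a finite state space. This is a sketch under assumed data: the kernel `M` and the distributions `mu`, `nu` are arbitrary illustrative choices, and `nu` is strictly positive so that all entropies are finite.

```python
import math

def tv(m1, m2):
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(m1, m2))

def dobrushin(M):
    """Dobrushin coefficient of a row-stochastic matrix."""
    return max(tv(rx, ry) for rx in M for ry in M)

def push(mu, M):
    """The measure mu M."""
    return [sum(mu[x] * M[x][z] for x in range(len(M))) for z in range(len(M[0]))]

def boltzmann_entropy(mu, nu):
    """Ent(mu | nu) = sum_x mu(x) log(mu(x) / nu(x)); assumes mu << nu."""
    return sum(m * math.log(m / n) for m, n in zip(mu, nu) if m > 0)

# Illustrative kernel and distributions (assumed, not from the text).
M = [[0.5, 0.5, 0.0],
     [0.2, 0.3, 0.5],
     [0.1, 0.1, 0.8]]
mu = [0.5, 0.3, 0.2]
nu = [0.2, 0.3, 0.5]

beta = dobrushin(M)
before = boltzmann_entropy(mu, nu)
after = boltzmann_entropy(push(mu, M), push(nu, M))

print("Ent(mu | nu)        :", before)
print("Ent(mu M | nu M)    :", after)
print("beta(M) Ent(mu | nu):", beta * before)
```

On this example the entropy after one transition is well below $\beta(M)$ times the initial entropy, which is consistent with the estimate being an upper bound rather than an equality.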



4.3 Contraction Properties of Feynman-Kac 
Semigroups 

In this section, we discuss the contraction properties of the nonlinear
semigroup $\Phi_{p,n}$ presented in Section 2.7 with respect to the h-relative entropy
criteria introduced in Section 4.2. We recall that $\Phi_{p,n}$ is the nonlinear
mapping from $\mathcal{P}(E_p)$ into $\mathcal{P}(E_n)$ defined by

$$\Phi_{p,n}(\mu_p)(f_n) = \frac{\mu_p(Q_{p,n}(f_n))}{\mu_p(Q_{p,n}(1))}$$

In the study of regularity properties of $\Phi_{p,n}$ the following notion will
play a major role.

Definition 4.3.1 Let $(E, \mathcal{E})$ and $(F, \mathcal{F})$ be a pair of measurable spaces.
We consider an h-relative entropy criterion $H$ on the sets $\mathcal{P}(E)$ and $\mathcal{P}(F)$.
The contraction or Lipschitz coefficient $\beta_H(\Phi) \in \mathbb{R}_+ \cup \{\infty\}$ of a mapping
$\Phi : \mathcal{P}(E) \to \mathcal{P}(F)$ with respect to $H$ is the best constant such that for any
pair of measures $\mu, \nu \in \mathcal{P}(E)$ we have

$$H(\Phi(\mu), \Phi(\nu)) \leq \beta_H(\Phi)\, H(\mu, \nu)$$

When $H$ represents the total variation distance, we simplify notation and
sometimes we write $\beta(\Phi)$ instead of $\beta_H(\Phi)$.

When $H$ is the total variation distance, the parameter $\beta(\Phi)$ coincides with
the traditional notion of a Lipschitz constant of a mapping between two
metric spaces. In addition, for linear mappings, it coincides with the
Dobrushin ergodic coefficient defined in Section 4.2.2.

One of the main objectives of this section will be to estimate the contraction
coefficients $\beta_H(\Phi_{p,n})$ of the nonlinear Feynman-Kac transformations
$\Phi_{p,n}$. By the semigroup property and the definition of the contraction
coefficient, we start by noting that for any $0 \leq p_1 \leq p_2 \leq n$ we have

$$\beta_H(\Phi_{p_1,n}) \leq \beta_H(\Phi_{p_1,p_2})\, \beta_H(\Phi_{p_2,n})$$







Such arguments are powerful tools for the study of the asymptotic stability
properties of the semigroup $\Phi_{p,n}$. For instance, for any pair of measures
with $H(\mu_p, \nu_p) < \infty$, we can check that

$$\exists n \in \mathbb{N} : \ \beta_H(\Phi_{p,p+n}) < 1 \ \Longrightarrow \ \lim_{n \to \infty} H(\Phi_{p,p+n}(\mu_p), \Phi_{p,p+n}(\nu_p)) = 0$$

Before getting into further details, it is instructive to note that $\Phi_{p,n}$
may have completely different kinds of asymptotic behavior. We examine
hereafter two "opposite" situations. When the potential functions $G_n$ are
constant functions, then we have $P_{p,n} = M_{p,n}$. In this case, the asymptotic
stability properties of $\Phi_{p,n}$ are reduced to that of $M_{p,n}$. On the other hand,
if the semigroup $M_n = Id$, then we also have $P_{p,n} = Id$. In this situation
$\beta(P_{p,n}) = 1$, and one cannot expect to obtain uniform stability properties.
For instance, in the homogeneous case $E_n = E$ with a potential $G_n = e^{-V}$
associated with a nonnegative energy function $V$, the semigroup $\Phi_{p,n}$ can
be rewritten as

$$\Phi_{p,n}(\mu)(f) = \Psi_{p,n}(\mu)(f) = \frac{\mu(e^{-(n-p)V} f)}{\mu(e^{-(n-p)V})}$$

It is then easily seen that $\Phi_{p,n}(\mu)$ tends as $n \to \infty$ and in a narrow sense
to the restriction of $\mu$ to the subset

$$V^\star = \{x \in E \;;\; V(x) = \mu\text{-}\mathrm{essinf}\ V\}$$
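The concentration phenomenon described in this example is easy to observe on a finite state space. The sketch below assumes $M_n = Id$ and $G_n = e^{-V}$ as in the text; the energy function `V` and the initial distribution `mu` are illustrative choices, and the limit is compared with the restriction of `mu` to the minimizers of `V`, normalized to a probability measure.

```python
import math

def phi(mu, V, steps):
    """Phi_{p,n}(mu) with n - p = steps, M = Id and G = exp(-V):
    reweight mu by exp(-steps * V) and normalize."""
    weights = [m * math.exp(-steps * v) for m, v in zip(mu, V)]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative energy function and initial distribution (assumed).
V = [0.0, 0.0, 1.0, 2.5]
mu = [0.1, 0.3, 0.4, 0.2]

# Normalized restriction of mu to the set where V attains its minimum.
v_min = min(v for v, m in zip(V, mu) if m > 0)
restricted = [m if v == v_min else 0.0 for m, v in zip(mu, V)]
mass = sum(restricted)
limit = [r / mass for r in restricted]

for steps in (1, 5, 50):
    print(steps, phi(mu, V, steps))
print("limit", limit)
```

The limit depends on the initial distribution through its restriction to the minimizing set, which is why no uniform stability can hold in this situation.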

Exact calculations as in previous examples are in general not possible,
and the question of the regularity properties of general Feynman-Kac
semigroups is a difficult nonlinear problem. In the present section, we design a
semigroup approach based on the Markov contraction analysis developed in
Section 4.2.2 to give some partial answers to this question. The first central
idea is to use the alternative description of the semigroup $\Phi_{p,n}$ presented in
(2.37), Proposition 2.7.1. More precisely, we recall that $\Phi_{p,n}$ is alternatively
defined by the equation

$$\Phi_{p,n}(\mu_p) = \Psi_{p,n}(\mu_p)\, P_{p,n} \qquad (4.14)$$

The Boltzmann-Gibbs transformation $\Psi_{p,n}$ from $\mathcal{P}(E_p)$ into itself is defined
by

$$\Psi_{p,n}(\mu_p)(dx_p) = \frac{1}{\mu_p(G_{p,n})}\, G_{p,n}(x_p)\, \mu_p(dx_p) \quad \text{with} \quad G_{p,n} = Q_{p,n}(1)$$

and the kernel $P_{p,n}$ can be regarded as the transition from $E_p$ into $E_n$ of a
nonhomogeneous Markov chain from time $p$ to time $n$ with transition semigroup
$(R^{(n)}_{p,q})_{0 \leq p \leq q \leq n}$. More precisely, we have that

$$P_{p,n} = R^{(n)}_{p,n} \quad \text{with} \quad R^{(n)}_{p,q}(f_q) = Q_{p,q}(f_q\, G_{q,n}) / Q_{p,q}(G_{q,n})$$

The next proposition expresses the fact that the Dobrushin ergodic coefficient
$\beta(P_{p,n})$ of the Markov kernel $P_{p,n}$ is a measure of the oscillations of
the mapping $\Phi_{p,n}$ with respect to the total variation distance.







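Proposition 4.3.1 For any $0 \leq p \leq n$, we have

$$\beta(P_{p,n}) = \sup_{\mu_p, \nu_p \in \mathcal{P}(E_p)} \frac{\|\Phi_{p,n}(\mu_p) - \Phi_{p,n}(\nu_p)\|_{\rm tv}}{\|\Psi_{p,n}(\mu_p) - \Psi_{p,n}(\nu_p)\|_{\rm tv}} = \sup_{\mu_p, \nu_p \in \mathcal{P}(E_p)} \|\Phi_{p,n}(\mu_p) - \Phi_{p,n}(\nu_p)\|_{\rm tv} \qquad (4.15)$$

In addition, for any h-relative entropy $H$ and for any $\mu_p, \nu_p \in \mathcal{P}(E_p)$, we
have

$$H(\Phi_{p,n}(\mu_p), \Phi_{p,n}(\nu_p)) \leq \beta(P_{p,n})\, H(\Psi_{p,n}(\mu_p), \Psi_{p,n}(\nu_p)) \qquad (4.16)$$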

Proof: 

To establish the first assertion, we note that for any $x_p, y_p \in E_p$ we have

$$\Phi_{p,n}(\delta_{x_p}) = P_{p,n}(x_p, \cdot) \quad \text{and} \quad \|\delta_{x_p} - \delta_{y_p}\|_{\rm tv} = 1$$

from which we conclude that the two terms in the r.h.s. of (4.15) are
greater than $\beta(P_{p,n})$. The reverse inequality is a simple consequence of
(4.14) and Proposition 4.2.1. The final assertion is again a consequence of
Theorem 4.2.1 and (4.14). ■



The inequality (4.16) is of course not sufficient to estimate $\beta_H(\Phi_{p,n})$, but
it already shows a natural relation between $\beta_H(\Phi_{p,n})$ and the Dobrushin
coefficient $\beta(P_{p,n})$ of $P_{p,n}$. More precisely, from (4.16) we find that

$$\beta_H(\Phi_{p,n}) \leq \beta(P_{p,n})\, \beta_H(\Psi_{p,n})$$

This inequality is one of the cornerstones of the forthcoming analysis. It
underlines the two different roles played by the Markov kernel $P_{p,n}$ and
the nonlinear Boltzmann-Gibbs transformation $\Psi_{p,n}$ in the estimation of
$\beta_H(\Phi_{p,n})$.

The rest of the section is decomposed into three parts. In the first part,
we derive some "local" functional concentration inequalities. We present a
class of h'-divergence criteria with respect to which the semigroup $\Phi_{p,n}$ is
locally Lipschitz. We illustrate these results with several examples of
concentration inequalities with respect to the Boltzmann, the Havrda-Charvat,
the Hellinger, and the $L_2$ relative entropies presented in Section 4.2. The
second part of this section is concerned with uniform concentration
estimates. We propose a series of sufficient conditions under which the local
concentration inequalities can be turned into uniform Lipschitz inequalities.
In the third and last part of the section, we use these results to estimate
the contraction parameters $\beta_H(\Phi_{p,n})$.

4.3.1 Functional Entropy Inequalities 

The next theorem is the main result of this section. It provides a way
to estimate the local $H$-contraction properties of the semigroup $\Phi_{p,n}$ in
terms of the Dobrushin coefficient $\beta(P_{p,n})$ and the relative oscillations of
the potential functions $G_{p,n}$.

To describe these functional inequalities precisely, it is convenient to
introduce some additional notation. When $H$ is the h'-divergence associated
with a differentiable $h' \in C^1(\mathbb{R}_+)$, we denote by $\Delta h$ the function on $\mathbb{R}_+^2$
defined by

$$\Delta h(t, s) = h'(t) - h'(s) - \partial h'(s)\, (t - s) \quad (\geq 0)$$

where $\partial h'(s)$ stands for the derivative of $h'$ at $s \in \mathbb{R}_+$. We will also use the
growth condition

$$(\mathcal{H})_a \qquad \forall (r, s, t) \in \mathbb{R}_+^3 \text{ we have } \Delta h(rt, s) \leq a(r)\, \Delta h(t, \theta(r, s)) \qquad (4.17)$$

for some nondecreasing function $a$ on $\mathbb{R}_+$ and a mapping $\theta$ on $\mathbb{R}_+^2$ such that
for any $r \in \mathbb{R}_+$, $\theta(r, \mathbb{R}_+) = \mathbb{R}_+$.

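Theorem 4.3.1 For any $0 \leq p \leq n$ and $\mu_p, \nu_p \in \mathcal{P}(E_p)$, we have

$$\|\Phi_{p,n}(\mu_p) - \Phi_{p,n}(\nu_p)\|_{\rm tv} \leq \beta(P_{p,n})\, \frac{\|G_{p,n}\|_{\rm osc}}{\mu_p(G_{p,n}) \vee \nu_p(G_{p,n})}\, \|\mu_p - \nu_p\|_{\rm tv} \qquad (4.18)$$

In addition, for any h'-divergence $H$ satisfying the growth condition $(\mathcal{H})_a$
for some nondecreasing function $a$, we have

$$H(\Phi_{p,n}(\mu_p), \Phi_{p,n}(\nu_p)) \leq \beta(P_{p,n})\, \frac{\|G_{p,n}\|}{\nu_p(G_{p,n})}\, a\left(\frac{\nu_p(G_{p,n})}{\mu_p(G_{p,n})}\right) H(\mu_p, \nu_p) \qquad (4.19)$$

The proof of the theorem is a simple consequence of Proposition 4.3.1 and
the next technical lemma.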



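Lemma 4.3.1 Let $G$ be a strictly positive and measurable function on
some measurable space $(E, \mathcal{E})$. We associate with $G$ the Boltzmann-Gibbs
transformation $\Psi$ from $\mathcal{P}(E)$ into itself defined by

$$\Psi(\mu)(dx) = \frac{1}{\mu(G)}\, G(x)\, \mu(dx)$$

For any $\mu, \nu \in \mathcal{P}(E)$, we have

$$\|\Psi(\mu) - \Psi(\nu)\|_{\rm tv} \leq \frac{\|G\|_{\rm osc}}{\mu(G) \vee \nu(G)}\, \|\mu - \nu\|_{\rm tv}$$

In addition, for any h'-divergence $H$ satisfying the growth condition $(\mathcal{H})_a$
for some nondecreasing function $a$ we have

$$H(\Psi(\mu), \Psi(\nu)) \leq \frac{\|G\|}{\nu(G)}\, a\left(\frac{\nu(G)}{\mu(G)}\right) H(\mu, \nu)$$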







Proof: 

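To prove the first assertion, we use the decomposition

$$\Psi(\mu)(f) - \Psi(\nu)(f) = \frac{1}{\mu(G)}\, (\mu - \nu)\big(G\, (f - \Psi(\nu)(f))\big)$$

for any $\mu, \nu \in \mathcal{P}(E)$ and $f \in \mathcal{B}_b(E)$. Since we have

$$G(x)\, [f(x) - \Psi(\nu)(f)] - G(y)\, [f(y) - \Psi(\nu)(f)] = [G(x) - G(y)]\, [f(x) - \Psi(\nu)(f)] + G(y)\, [f(x) - f(y)]$$

we find that

$$\mathrm{osc}(G\, [f - \Psi(\nu)(f)]) \leq \|G\|_{\rm osc}\, \mathrm{osc}(f)$$

Suppose next that $H$ is an h'-divergence satisfying the assumptions of the
theorem. If $\mu \not\ll \nu$, then we have $H(\mu, \nu) = \infty$ and by Proposition 4.3.1 the
result is trivial. Then suppose $\mu \ll \nu$. To prove the result, it is convenient
to use the variational representation of $H$ on $\mathcal{P}(E)$

$$H(\mu, \nu) = \inf_{s \in \mathbb{R}_+} \int \Delta h\left(\frac{d\mu}{d\nu}, s\right) d\nu$$

It is a simple exercise to check that the infimum is attained at $s = 1$ and

$$\frac{d\Psi(\mu)}{d\Psi(\nu)} = \frac{\nu(G)}{\mu(G)}\, \frac{d\mu}{d\nu}$$

Using this representation, we notice that

$$H(\Psi(\mu), \Psi(\nu)) = \inf_{s \in \mathbb{R}_+} \int \Delta h\left(r\, \frac{d\mu}{d\nu}, s\right) \frac{G}{\nu(G)}\, d\nu$$

Under our assumptions, we find that

$$H(\Psi(\mu), \Psi(\nu)) \leq a(r)\, \frac{\|G\|}{\nu(G)}\, \inf_{s \in \mathbb{R}_+} \int \Delta h\left(\frac{d\mu}{d\nu}, \theta(r, s)\right) d\nu$$

with $r = \nu(G)/\mu(G)$. This clearly ends the proof of the lemma. ■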
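The total variation part of Lemma 4.3.1 can be checked numerically on a finite state space. The sketch below is an illustration only: it assumes the Lipschitz constant $\|G\|_{\rm osc}/(\mu(G) \vee \nu(G))$ appearing in (4.18), with $\|G\|_{\rm osc} = \|G\| + \mathrm{osc}(G)$, and the potential `G` and the distributions `mu`, `nu` are arbitrary choices.

```python
def boltzmann_gibbs(mu, G):
    """Psi(mu)(x) = G(x) mu(x) / mu(G) on a finite state space."""
    weights = [g * m for g, m in zip(G, mu)]
    total = sum(weights)
    return [w / total for w in weights]

def tv(m1, m2):
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(m1, m2))

def integral(mu, f):
    """mu(f) for a vector mu and a function f given by its values."""
    return sum(m * v for m, v in zip(mu, f))

# Illustrative potential and distributions (assumed, not from the text).
G = [1.0, 2.0, 4.0]
mu = [0.5, 0.3, 0.2]
nu = [0.2, 0.3, 0.5]

# ||G||_osc = ||G|| + osc(G), with ||G|| the supremum norm.
g_osc_norm = max(G) + (max(G) - min(G))

lhs = tv(boltzmann_gibbs(mu, G), boltzmann_gibbs(nu, G))
constant = g_osc_norm / max(integral(mu, G), integral(nu, G))
rhs = constant * tv(mu, nu)

print("||Psi(mu) - Psi(nu)||_tv:", lhs)
print("Lipschitz bound         :", rhs)
```

Since $\|G\|_{\rm osc} \leq 2\|G\|$, the constant is at most twice the ratio of the supremum of $G$ to its mean, which is the quantity that the relative oscillation bounds of Section 4.3.2 control.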



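Corollary 4.3.1 For any $0 \leq p \leq n$ and $\mu_p, \nu_p \in \mathcal{P}(E_p)$, we have the
following contraction estimates:

• Boltzmann relative entropy

$$\mathrm{Ent}(\Phi_{p,n}(\mu_p) \,|\, \Phi_{p,n}(\nu_p)) \leq \beta(P_{p,n})\, \frac{\|G_{p,n}\|}{\mu_p(G_{p,n})}\, \mathrm{Ent}(\mu_p \,|\, \nu_p)$$

• Havrda-Charvat entropy of order $\alpha > 1$

$$\mathcal{C}_\alpha(\Phi_{p,n}(\mu_p), \Phi_{p,n}(\nu_p)) \leq \beta(P_{p,n})\, \frac{\|G_{p,n}\|}{\nu_p(G_{p,n})} \left(\frac{\nu_p(G_{p,n})}{\mu_p(G_{p,n})}\right)^\alpha \mathcal{C}_\alpha(\mu_p, \nu_p)$$

• Hellinger integrals of order $\alpha \in (0, 1)$

$$\mathcal{H}_\alpha(\Phi_{p,n}(\mu_p), \Phi_{p,n}(\nu_p)) \leq \beta(P_{p,n})\, \frac{\|G_{p,n}\|}{\nu_p(G_{p,n})} \left(\frac{\nu_p(G_{p,n})}{\mu_p(G_{p,n})}\right)^\alpha \mathcal{H}_\alpha(\mu_p, \nu_p)$$

• $L_2$-relative entropy

$$\left\|\frac{d\Phi_{p,n}(\mu_p)}{d\Phi_{p,n}(\nu_p)} - 1\right\|^2_{2, \Phi_{p,n}(\nu_p)} \leq \beta(P_{p,n})\, \frac{\|G_{p,n}\|}{\nu_p(G_{p,n})} \left(\frac{\nu_p(G_{p,n})}{\mu_p(G_{p,n})}\right)^2 \left\|\frac{d\mu_p}{d\nu_p} - 1\right\|^2_{2, \nu_p}$$

Proof: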

Proof: 

The proof of all of these functional inequalities amounts to a check that
the respective convex functions $h'$ satisfy the growth condition stated in
Theorem 4.3.1. The Boltzmann entropy corresponds to the situation where
$h'(t) = t \log t$. In this case, we notice that

$$\Delta h(t, s) = t \log(t) - s \log(s) - (1 + \log(s))\, (t - s) = t \log(t) - (t - s) - t \log(s) = t \log(t/s) - (t - s)$$

from which we find that

$$\Delta h(rt, s) = r\, [t \log(t/(s/r)) - (t - (s/r))] = r\, \Delta h(t, s/r)$$

We conclude that (4.17) is met with $a(r) = r$ and $\theta(r, s) = s/r$. This clearly
ends the proof of the first estimate. For the Havrda-Charvat entropy of
order $\alpha > 1$, we have $h'(t) = \frac{1}{\alpha - 1}(t^\alpha - 1)$. In this case, we notice that

$$(\alpha - 1)\, \Delta h(rt, s) = r^\alpha \left(t^\alpha - (s/r)^\alpha - \alpha\, (s/r)^{\alpha - 1}\, (t - s/r)\right) = r^\alpha\, (\alpha - 1)\, \Delta h(t, s/r)$$

from which we conclude that the growth condition (4.17) is now met with
$a(r) = r^\alpha$ and $\theta(r, s) = s/r$. This ends the proof of the second estimate. The
Hellinger integrals of order $\alpha \in (0, 1)$ correspond to the choice $h'(t) = t - t^\alpha$.
In this situation, we observe that

$$\Delta h(t, s) = t - t^\alpha - s + s^\alpha - (1 - \alpha s^{\alpha - 1})\, (t - s) = s^\alpha - t^\alpha + \alpha s^{\alpha - 1}\, (t - s)$$

from which we conclude that

$$\Delta h(rt, s) = r^\alpha\, \Delta h(t, s/r)$$

Arguing as above, we conclude that the growth condition (4.17) is met with
the same parameters. The proof of the third estimate is now completed.
The final one corresponds to the case $h'(t) = (t - 1)^2$. Since we have

$$\Delta h(t, s) = (t - 1)^2 - (s - 1)^2 - 2(s - 1)\, (t - s) = (t - s)\, [(t + s - 2) - 2(s - 1)] = (t - s)^2$$

we find that (4.17) is met with $a(r) = r^2$ and again $\theta(r, s) = s/r$. This ends
the proof of the corollary. ■
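The scaling identities used in this proof, $\Delta h(rt, s) = a(r)\, \Delta h(t, s/r)$ with $\theta(r, s) = s/r$, can be verified numerically for the four choices of $h'$. This is a minimal sketch: the orders `ORDER_HC` and `ORDER_HELL` and the grid of test values are arbitrary assumptions made for illustration.

```python
import math

def delta_h(h, dh, t, s):
    """Delta h(t, s) = h'(t) - h'(s) - (d/ds h')(s) * (t - s)."""
    return h(t) - h(s) - dh(s) * (t - s)

ORDER_HC = 1.5    # illustrative Havrda-Charvat order, alpha > 1
ORDER_HELL = 0.5  # illustrative Hellinger order, alpha in (0, 1)

# Each case: (h', derivative of h', growth function a).
cases = {
    "Boltzmann": (
        lambda t: t * math.log(t),
        lambda s: 1.0 + math.log(s),
        lambda r: r,
    ),
    "Havrda-Charvat": (
        lambda t: (t ** ORDER_HC - 1.0) / (ORDER_HC - 1.0),
        lambda s: ORDER_HC * s ** (ORDER_HC - 1.0) / (ORDER_HC - 1.0),
        lambda r: r ** ORDER_HC,
    ),
    "Hellinger": (
        lambda t: t - t ** ORDER_HELL,
        lambda s: 1.0 - ORDER_HELL * s ** (ORDER_HELL - 1.0),
        lambda r: r ** ORDER_HELL,
    ),
    "L2": (
        lambda t: (t - 1.0) ** 2,
        lambda s: 2.0 * (s - 1.0),
        lambda r: r ** 2,
    ),
}

grid = [0.3, 0.7, 1.0, 1.8, 2.5]
max_error = 0.0
for name, (h, dh, a) in cases.items():
    for r in grid:
        for t in grid:
            for s in grid:
                lhs = delta_h(h, dh, r * t, s)
                rhs = a(r) * delta_h(h, dh, t, s / r)
                max_error = max(max_error, abs(lhs - rhs))

print("largest deviation from the scaling identity:", max_error)
```

In all four cases the growth condition (4.17) holds with equality, so the deviation is only floating-point rounding.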



4.3.2 Contraction Coefficients 

The local functional inequalities presented in the preceding theorem show
the way to estimate the uniform contraction coefficients $\beta_H(\Phi_{p,n})$. To fix
the ideas, we recall that

$$\beta_H(\Phi_{p,n}) \leq \beta(P_{p,n})\, \beta_H(\Psi_{p,n}) \qquad (4.20)$$

Our immediate objective is to connect more precisely the contraction
coefficient $\beta_H(\Psi_{p,n})$ of the Boltzmann-Gibbs transformation $\Psi_{p,n}$ with the
relative oscillations of the potential functions $G_{p,n}$.
In the further development of this section, $H$ represents any h'-divergence
satisfying the growth condition $(\mathcal{H})_a$ stated in (4.17) or the total
variation distance on $\mathcal{P}(E)$. To unify the exposition, it is convenient to
introduce the following terminology.

Definition 4.3.2 For any h'-divergence $H$ satisfying the growth condition
$(\mathcal{H})_a$ for some nondecreasing function $a$ on $\mathbb{R}_+$, we denote by $a_H$ the
function on $\mathbb{R}_+$ defined by $a_H(r) = r\, a(r)$. When $H$ is the total variation
distance, we denote by $a_H$ the function on $\mathbb{R}_+$ defined by $a_H(r) = 2r$.

Proposition 4.3.2 For any $0 \leq p \leq n$, we have the estimates

$$\beta_H(\Phi_{p,n}) \leq a_H(r_{p,n})\, \beta(P_{p,n}) \quad \text{with} \quad r_{p,n} = \sup_{x_p, y_p} (G_{p,n}(x_p)/G_{p,n}(y_p))$$



Proof: 

When $H$ is an h'-divergence (satisfying the growth condition $(\mathcal{H})_a$), the
estimate stated in the proposition is a simple consequence of Lemma 4.3.1.
We next examine the situation where $H$ is the total variation distance.
Applying again Lemma 4.3.1 to $G = G_{p,n}$, we find that

$$\beta_H(\Psi_{p,n}) \leq \frac{\|G_{p,n}\|_{\rm osc}}{\inf_{E_p} G_{p,n}}$$

Since we have $\|G_{p,n}\|_{\rm osc}/\|G_{p,n}\| = (1 + \mathrm{osc}(G_{p,n}/\|G_{p,n}\|)) \leq 2$, we
conclude that

$$\beta_H(\Psi_{p,n}) \leq 2\, r_{p,n}$$

In view of (4.20), this implies that $\beta_H(\Phi_{p,n}) \leq 2\, r_{p,n}\, \beta(P_{p,n})$, and the proof
of the proposition is completed. ■



After these preliminaries to properly describe the relative entropies and
the various parameters that we are using, our next objective is to find a
set of sufficient conditions on the semigroups $M_{p,n}$, $Q_{p,n}$, and $R^{(n)}_{p,q}$ and on
the potential function under which we have

$$\lim_{n \to \infty} \beta(P_{p,n}) = 0 \quad \text{and} \quad \sup_{x_p, y_p, p \leq n} (G_{p,n}(x_p)/G_{p,n}(y_p)) < \infty$$

We will investigate these questions in terms of three regularity conditions
on the semigroups $(M_{p,q}, Q_{p,q}, R^{(n)}_{p,q})$. We say that a given semigroup $I_{p,q}$
satisfies condition $(I)_m$ when we have for some integer parameter $m \geq 1$
and some sequence of numbers $\epsilon_p(I) \in (0, 1)$

$$I_{p,p+m}(x_p, \cdot) \geq \epsilon_p(I)\, I_{p,p+m}(y_p, \cdot)$$

for any $(x_p, y_p) \in E_p^2$ and $p \in \mathbb{N}$. In this notation, the semigroups $M_{p,q}$ and
$Q_{p,q}$ satisfy respectively conditions $(M)_m$ and $(Q)_m$ when we have for any
$p \in \mathbb{N}$ and any pair $(x_p, y_p) \in E_p^2$

$$M_{p,p+m}(x_p, \cdot) \geq \epsilon_p(M)\, M_{p,p+m}(y_p, \cdot)$$

$$\text{and} \quad Q_{p,p+m}(x_p, \cdot) \geq \epsilon_p(Q)\, Q_{p,p+m}(y_p, \cdot)$$

With some obvious abusive notation, the semigroup $R^{(n)}_{p,q}$ satisfies condition
$(R^{(n)})_m$ when we have for any $0 \leq p + m \leq n$ and any pair $(x_p, y_p) \in E_p^2$

$$R^{(n)}_{p,p+m}(x_p, \cdot) \geq \epsilon_p(R^{(n)})\, R^{(n)}_{p,p+m}(y_p, \cdot)$$

We will also assume frequently that the potential functions $G_n$ satisfy the
condition $(G)$ stated on page 115, for some $\epsilon_n(G) > 0$. For any $0 \leq p \leq n$,
we also use the notation

$$\epsilon_{p,n}(G) = \prod_{p \leq k < n} \epsilon_k(G)$$

In this notation, we notice that $\epsilon_{n,n}(G) = 1$ and $\epsilon_{n,n+1}(G) = \epsilon_n(G)$.

In the next proposition, we have collected some key implications between
the mixing conditions above as well as some estimations of mixing parameters.
We also underline that condition $(Q)_m$ ensures a uniform control on
the relative oscillations of the nonhomogeneous potential functions $G_{p,n}$.
As mentioned above, these results will be of constant use in the forthcoming
analysis of Feynman-Kac semigroup stability and the convergence of
particle methods. To emphasize the role of the forthcoming estimates we
recall that the Dobrushin ergodic coefficient is an operator norm (see
Proposition 4.2.1), that is we have

$$\beta(P_{p,p+q}) \leq \prod_{k=0}^{\lfloor q/m \rfloor - 1} \beta\left(R^{(p+q)}_{p+km,\, p+(k+1)m}\right)$$


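Proposition 4.3.3 If condition $(Q)_m$ is satisfied, then $(R^{(n)})_m$ is also
met, and for any $0 \leq p + m \leq n$ we have

$$\epsilon_p(R^{(n)}) \geq \epsilon_p^2(Q) \quad \text{and} \quad \beta(R^{(n)}_{p,p+m}) \leq 1 - \epsilon_p^2(Q)$$

In addition, for any $(x_p, y_p) \in E_p^2$, we have the uniform estimate

$$G_{p,n}(x_p) \geq \epsilon_p(Q)\, G_{p,n}(y_p) \qquad (4.21)$$

When $(G)$ and $(M)_m$ are satisfied, then $(Q)_m$ is met and we have

$$\epsilon_p(Q) \geq \epsilon_{p,p+m}(G)\, \epsilon_p(M) \quad \text{and} \quad \beta(R^{(n)}_{p,p+m}) \leq 1 - \epsilon_{p+1,p+m}(G)\, \epsilon_p^2(M)$$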


Proof: 

The proof of the first implication is very simple. Suppose $(Q)_m$ holds
true. Then, for any nonnegative function $f_q \in \mathcal{B}_b(E_q)$ and $x_p, y_p \in E_p$,
$0 \leq p + m \leq q \leq n$, we have

$$R^{(n)}_{p,q}(f_q)(x_p) = \frac{Q_{p,q}(f_q\, G_{q,n})(x_p)}{Q_{p,q}(G_{q,n})(x_p)} \geq \epsilon_p^2(Q)\, R^{(n)}_{p,q}(f_q)(y_p)$$

This ends the proof of the first assertion. To prove (4.21), we observe that
for any $0 \leq p + m \leq n$

$$\frac{G_{p,n}(x_p)}{G_{p,n}(y_p)} = \frac{Q_{p,n}(1)(x_p)}{Q_{p,n}(1)(y_p)} = \frac{Q_{p,p+m}(G_{p+m,n})(x_p)}{Q_{p,p+m}(G_{p+m,n})(y_p)} \geq \epsilon_p(Q)$$

To prove the second assertion, we observe that

$$\frac{Q_{p,q}(f_q)(x_p)}{Q_{p,q}(f_q)(y_p)} = \frac{G_p(x_p)\, M_{p+1}(Q_{p+1,q}(f_q))(x_p)}{G_p(y_p)\, M_{p+1}(Q_{p+1,q}(f_q))(y_p)}$$

$$\geq \epsilon_p(G)\, \frac{M_{p+1}(G_{p+1}\, M_{p+2}\, Q_{p+2,q}(f_q))(x_p)}{M_{p+1}(G_{p+1}\, M_{p+2}\, Q_{p+2,q}(f_q))(y_p)}$$

$$\geq \epsilon_p(G)\, \epsilon_{p+1}(G)\, \frac{M_{p,p+2}(Q_{p+2,q}(f_q))(x_p)}{M_{p,p+2}(Q_{p+2,q}(f_q))(y_p)}$$

Using a simple induction, we conclude that

$$\frac{Q_{p,q}(f_q)(x_p)}{Q_{p,q}(f_q)(y_p)} \geq \epsilon_{p,p+m}(G)\, \frac{M_{p,p+m}(Q_{p+m,q}(f_q))(x_p)}{M_{p,p+m}(Q_{p+m,q}(f_q))(y_p)}$$

Finally, we get

$$\frac{Q_{p,q}(f_q)(x_p)}{Q_{p,q}(f_q)(y_p)} \geq \epsilon_{p,p+m}(G)\, \epsilon_p(M)$$

and we conclude that $\epsilon_p(Q) \geq \epsilon_{p,p+m}(G)\, \epsilon_p(M)$. To estimate the Dobrushin
coefficient $\beta(R^{(n)}_{p,p+m})$, we observe that, for any $x_p, y_p \in E_p$ and for any
nonnegative bounded measurable function $f_{p+m}$ on $E_{p+m}$, we have

$$R^{(n)}_{p,p+m}(f_{p+m})(x_p) \geq \epsilon_{p+1,p+m}(G)\, \epsilon_p^2(M)\, \frac{M_{p,p+m}(f_{p+m}\, G_{p+m,n})(y_p)}{M_{p,p+m}(G_{p+m,n})(y_p)}$$

We end the proof using the representation (4.12) in Proposition 4.2.1. ■



For later use, we have collected in the next two corollaries some simple
consequences of the preceding proposition. In the first one, we provide
uniform and explicit controls on the contraction coefficients of $\Psi_{p,n}$. The
second corollary presents a series of estimations of the Dobrushin coefficient
$\beta(P_{p,n})$.

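Corollary 4.3.2 For any $n \geq m \geq 1$ and $p \in \mathbb{N}$, the following conditions
are satisfied:

$$(Q)_m \ \Longrightarrow \ \beta_H(\Psi_{p,p+n}) \leq a_H(\epsilon_p^{-1}(Q))$$

$$(G) \text{ and } (M)_m \ \Longrightarrow \ \beta_H(\Psi_{p,p+n}) \leq a_H(\epsilon_p^{-1}(M)\, \epsilon_{p,p+m}^{-1}(G))$$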



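Corollary 4.3.3 For any $0 \leq p + q \leq n$, we have the following series of
estimates and implications:

$$(R^{(p+q)})_m \ \Longrightarrow \ \beta(P_{p,p+q}) \leq \prod_{k=0}^{\lfloor q/m \rfloor - 1} \left(1 - \epsilon_{p+km}(R^{(p+q)})\right)$$

$$(Q)_m \ \Longrightarrow \ \beta(P_{p,p+q}) \leq \prod_{k=0}^{\lfloor q/m \rfloor - 1} \left(1 - \epsilon^2_{p+km}(Q)\right)$$

$$(G) \text{ and } (M)_m \ \Longrightarrow \ \beta(P_{p,p+q}) \leq \prod_{k=0}^{\lfloor q/m \rfloor - 1} \left(1 - \epsilon^{(m)}_{p+km}(G, M)\right)$$

with $\epsilon_p^{(m)}(G, M) = \epsilon_p^2(M)\, \epsilon_{p+1,p+m}(G)$. In addition, we have that

$$(M)_m \text{ with } m = 1 \ \Longrightarrow \ \beta(P_{p,n}) \leq \prod_{k=p}^{n-1} \left(1 - \epsilon_k^2(M)\right) \qquad (4.22)$$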







To emphasize the improvements we obtain in strengthening each mixing
condition, it is instructive to examine the time-homogeneous situation. We
suppose the various mathematical objects are homogeneous with respect to
the time parameter, and we suppress the time index in the notation. When
condition $(Q)_m$ is met, we have proved that

$$\beta(P_{0,nm}) \leq (1 - \epsilon^2(Q))^n$$

Suppose now conditions $(G)$ and $(M)_m$ are met. By (4.21) we have in this
case $\epsilon(Q) \geq \epsilon^m(G)\, \epsilon(M)$, and from a previous estimate we find that

$$\beta(P_{0,nm}) \leq (1 - \epsilon^{2m}(G)\, \epsilon^2(M))^n$$

Nevertheless, under this stronger condition, it is more judicious to estimate
$\beta(P_{0,nm})$ directly. As stated in Corollary 4.3.3, we find that

$$\beta(P_{0,nm}) \leq (1 - \epsilon^{(m-1)}(G)\, \epsilon^2(M))^n$$

When the mixing condition $(M)_m$ holds true for $m = 1$, we even get a
potential free estimate of the contraction parameter

$$\beta(P_{0,n}) \leq (1 - \epsilon^2(M))^n$$
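The potential free estimate can be observed on a small time-homogeneous model. The sketch below makes several assumptions for illustration: a three-state kernel `M` with positive entries, an arbitrary potential `G`, and the convention $Q(x, z) = G(x)\, M(x, z)$ for the one-step Feynman-Kac operator, so that $Q_{0,n} = Q^n$ and $P_{0,n}(f) = Q^n(f)/Q^n(1)$. The mixing parameter $\epsilon(M)$ is computed as the smallest ratio $M(x, z)/M(y, z)$.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    size = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(len(A))]

def dobrushin(P):
    """Dobrushin coefficient of a row-stochastic matrix."""
    return max(0.5 * sum(abs(a - b) for a, b in zip(rx, ry)) for rx in P for ry in P)

# Illustrative kernel and potential (assumed, not from the text).
M = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.3, 0.2, 0.5]]
G = [1.0, 0.5, 2.0]
size = len(M)

# Condition (M)_m with m = 1: M(x, .) >= eps * M(y, .) for all x, y.
eps = min(M[x][z] / M[y][z]
          for x in range(size) for y in range(size) for z in range(size))

# One-step unnormalized operator Q(x, z) = G(x) M(x, z) (assumed convention).
Q = [[G[x] * M[x][z] for z in range(size)] for x in range(size)]

results = []
Qn = Q
for n in range(1, 7):
    if n > 1:
        Qn = mat_mul(Qn, Q)
    # P_{0,n}(x, .) = Q^n(x, .) / Q^n(1)(x): normalize each row.
    P = [[v / sum(row) for v in row] for row in Qn]
    results.append((n, dobrushin(P), (1.0 - eps ** 2) ** n))

for n, beta_n, bound in results:
    print(n, beta_n, bound)
```

On this example the computed coefficients decay much faster than the bound, which only uses the worst-case mixing constant and ignores the potential altogether.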

4.3.3 Strong Contraction Estimates

Theorem 4.3.1, Proposition 4.3.3, and Corollaries 4.3.3 and 4.3.2 are
powerful weapons to derive several strong contraction estimates. As in
Section 4.3.2, in order to unify our statements, we denote by $H$ the total
variation distance on $\mathcal{P}(E)$ or any h'-divergence satisfying the growth condition
$(\mathcal{H})_a$ stated in (4.17). We recall that for these two classes of distance-like
criteria, $a_H$ is the function on $\mathbb{R}_+$ defined respectively by $a_H(r) = 2r$
and $a_H(r) = r\, a(r)$ (see Definition 4.3.2).

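Proposition 4.3.4 We suppose condition $(Q)_m$ is met. Then for any $p \in \mathbb{N}$
and $n \ge m$, we have the contraction estimates

$$\left\| \Phi_{p,p+n}(\mu_p) - \Phi_{p,p+n}(\nu_p) \right\|_{\rm tv} \le \prod_{k=0}^{\lfloor n/m \rfloor - 1} \left(1 - \epsilon_{p+km}^2(Q)\right) \qquad (4.23)$$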


and 

$$\beta_H(\Phi_{p,p+n}) \le a_H\!\left(\epsilon_p^{-1}(Q)\right) \prod_{k=0}^{\lfloor n/m \rfloor - 1} \left(1 - \epsilon_{p+km}^2(Q)\right) \qquad (4.24)$$


We can improve the preceding inequalities by strengthening condition {Q)m. 



Using Theorem 4.3.1, (4.21), and Corollary 4.3.3, we prove the following 
result. 




4.3 Contraction Properties of Feynman-Kac Semigroups 143 



Proposition 4.3.5 When conditions $(G)$ and $(M)_m$ are met for some
$m \ge 1$, then we have for any $n \ge m$

KH-i 

/?H($p.p+n) < n (l - 4 +L(G, M)) ( 4 . 25 ) 

k=0 



with $\epsilon_p^{(m)}(G, M) = \epsilon_p^2(M)\, \epsilon_{p+1,p+m}(G)$.

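Proposition 4.3.6 Suppose the mixing condition $(M)_m$ is satisfied with
$m = 1$. Then we have for any $n \ge 1$

$$\beta_H(\Phi_{p,p+n}) \le a_H\!\left(\epsilon_p^{-1}(G)\, \epsilon_p^{-1}(M)\right) \prod_{k=0}^{n-1} \left(1 - \epsilon_{p+k}^2(M)\right) \qquad (4.26)$$

and the potential free estimates

$$\left\| \Phi_{p,p+n}(\mu_p) - \Phi_{p,p+n}(\nu_p) \right\|_{\rm tv} \le \prod_{k=0}^{n-1} \left(1 - \epsilon_{p+k}^2(M)\right) \qquad (4.27)$$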

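Corollary 4.3.4 Assume that conditions $(G)$ and $(M)_m$ are met for some
$m \ge 1$. For any $(\mu_p, \nu_p) \in \mathcal{P}(E_p)^2$ with $H(\mu_p, \nu_p) < \infty$ and $p \in \mathbb{N}$, we
have

$$\sum_{n \ge 0} \epsilon_n^2(M)\, \epsilon_{n+1,n+m}(G) = \infty \;\Longrightarrow\; \lim_{n \to \infty} H\left(\Phi_{p,p+n}(\mu_p), \Phi_{p,p+n}(\nu_p)\right) = 0$$

In addition, if we have $\lim_{n \to \infty} \frac{1}{n} \sum_{p=0}^{n} \epsilon_p^2(M)\, \epsilon_{p+1,p+m}(G) = \epsilon$, then we
have the asymptotic exponential decay

$$\lim_{n \to \infty} \frac{1}{n} \log H\left(\Phi_{p,p+nm}(\mu_p), \Phi_{p,p+nm}(\nu_p)\right) \le -\epsilon$$

Finally, if we assume that $\inf_{n \ge 0} \epsilon_n(G) = \epsilon(G)$ and $\inf_{n \ge 0} \epsilon_n(M) = \epsilon(M)$,
then we have

$$H\left(\Phi_{p,p+nm}(\mu_p), \Phi_{p,p+nm}(\nu_p)\right) \le c\, \exp\left(-n\, \epsilon^2(M)\, \epsilon^{m-1}(G)\right)$$

for some finite constant $c < \infty$ whose values only depend on the pair
$(\epsilon^m(G), \epsilon(M))$ and on the relative entropy $H(\mu_p, \nu_p)$ between the two
measures $\mu_p, \nu_p$.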



It is instructive to examine the nature of the various contraction es- 
timates obtained so far for time-homogeneous Feynman-Kac semigroups. 
We again suppose the various mathematical objects are time-homogeneous 
and we suppress the time index in the notation. We restrict ourselves to 







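the situation where $(G)$ and $(M)_m$ are satisfied for some $m \ge 1$. In this
context, we have proved that

$$\beta_H(\Phi_{0,nm}) \le a_H\!\left(\epsilon^{-1}(M)\, \epsilon^{-m}(G)\right) \left(1 - \epsilon^2(M)\, \epsilon^{m-1}(G)\right)^n$$

We can alternatively quantify the contraction properties in terms of the
first time $n$ at which $\beta_H(\Phi_{0,n}) \le 1/e$:

$$\Delta_H(\Phi) = \inf\left\{ n \in \mathbb{N} \,;\ \beta_H(\Phi_{0,n}) \le 1/e \right\}$$

This should be thought of as a relaxation time after which the semigroup
becomes contractive. Recalling that $\log(1 - x) \le -x$, for any $x \in (0, 1)$,
we find that

$$\Delta_H(\Phi) \le m\, \frac{1 + \log a_H\!\left(\epsilon^{-1}(M)\, \epsilon^{-m}(G)\right)}{\epsilon^2(M)\, \epsilon^{m-1}(G)}$$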
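The relaxation time estimate can be illustrated with a small computation. The sketch below assumes a bound of the form $a\,(1 - \delta)^n$ on the contraction coefficient after $n$ blocks of length $m$, with $\delta = \epsilon^2(M)\,\epsilon^{m-1}(G)$; the constant `a_value` stands for the prefactor $a_H(\cdot)$ and is passed in directly, and all names and numerical values are hypothetical.

```python
import math


def relaxation_time_bound(a_value: float, eps_g: float, eps_m: float, m: int) -> float:
    """Real-valued bound m * (1 + log a) / delta, delta = eps_m^2 * eps_g^(m-1)."""
    delta = eps_m ** 2 * eps_g ** (m - 1)
    return m * (1.0 + math.log(a_value)) / delta


def first_contractive_block(a_value: float, eps_g: float, eps_m: float, m: int) -> int:
    """Smallest number of blocks n with a * (1 - delta)^n <= 1/e."""
    delta = eps_m ** 2 * eps_g ** (m - 1)
    n = 0
    while a_value * (1.0 - delta) ** n > math.exp(-1.0):
        n += 1
    return n


if __name__ == "__main__":
    blocks = first_contractive_block(4.0, 0.5, 0.5, 2)
    print(2 * blocks, relaxation_time_bound(4.0, 0.5, 0.5, 2))
```

The loop uses the exact geometric bound, while the closed form relies on $\log(1 - x) \le -x$, so the closed form is slightly pessimistic; up to rounding to an integer number of blocks it dominates the value found by the loop.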



4.3.4 Weak Regularity Properties

In previous sections, we studied the regularity properties of the semigroup
$\Phi_{p,n}$ with respect to a collection of relative entropies on the set of
probability measures. The present section is concerned with regularity properties
with respect to the weak topology. More precisely, we want to estimate the
measure of contraction of the mappings

$$\Phi_{p,n}(\cdot)(f_n) \,:\, \mu_p \in \mathcal{P}(E_p) \longrightarrow \Phi_{p,n}(\mu_p)(f_n) \in \mathbb{R} \qquad (4.28)$$

where $f_n \in \mathcal{B}_b(E_n)$ is a given test function. This problem is intimately
related to a natural representation of the oscillations of (4.28). To describe
this formula precisely, it is convenient to introduce another key integral
operator related to the Boltzmann transformations $\Psi_{p,n}$.

Definition 4.3.3 For any 0<p<n and Pp G P{Ep), we denote by Qp^n 
the integral operator on M{Ep) and Bb(Ep) defined by 

In the next technical lemma, we have collected some important properties 
of 

Lemma 4.3.2 For any 0 <p <n and pp 6 ViEp), we have PpQ^fn = 0, 
and for any rjp G V{Ep) 

^p,n(Pp) ~ ^p,n(pp) = ^ J ^ “ f^p)Qp^n (^-29) 

In addition, for any fp G Bb{Ep), we have 



(4.30) 







Proof:

By the definition of Qp,n, we clearly have /XpQp,n = 0. Next we observe 
that 

[’®'p,n(*?p) ~ ^p,n(^p)](/p) 

= ^p,n{Vp)[fp ~~ '®’p,n(Mp)(/p)l 

= SSfel "p [sfcW" - = Sfel ’h^pMU) 

Since /ip^,n(/p) = 0, we cam also write 

[%M - ^P.n(Mp)](/p) = iVp - Mp]^fn(/p) 

This ends the proof of (4.29). Recalling the decomposition presented in the 
proof of Theorem 4.3.1 

( 2 p,n(/p)( 3 ^p) “ Qp,n(/p)(l/p)) 

= {Gp^ni^p) ~~ Gp,n(j/p)) ifpi^p) ” ^p,n(Mp)(/p)) 



+Gp,n(j/p) ifpi^p) “ fpiVp)) 

and the end of the proof of the lemma is straightforward. ■ 

Using the formula (4.29), we find that 

^P,nM-$p,n(/Xp)= X (T/p-Mfi^fn^P-n 

Vp\^p,n) 

It is sometimes more convenient to write the display above as a second- 
order development: 

^p,n(^p) “■ ^p,n(Mp) ~ [Vp ““ Mp]2p»n^p,n 

“ (r \ t/^P ” ^pK^P,n) X [t]p — fJ>p]Qp^nPp,n 

Vp\^Pyn) 

Furthermore, using (4.30) and (4.11), we find that for any fn G Bb{En) 
OSc(^,-n^P.n(/n)) < /?(Pp,n) OSc(/„) 

and 

\\^:„Pp,n{fn)\\ < P{Pp,n) ||/n|l 

Summarizing the discussion above, we have proved the following
proposition.







Proposition 4.3.7 For any 0 < p < n, Hp e V{Ep), and fn € Bb{En) 
with osc(/„) < 1, respectively ||/„|| < 1, there exists a function fp'^n^ in 
Bb(Ep) with osc(/p|Jf^) < 1, respectively ||/p(n^|| < 1, such that for any 
Tfp € P(Ep) we have 



ll«,.nW - ».,.M(/n)l < «Pp,n) l(>)p " «p)(/fe’)l 

and respectively 

W^pAvp) - %M\{fn)\ < 0{Pp,u) \{vp - 



4.4 Updated Feynman-Kac Models 



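The study of regularity properties of the updated Feynman-Kac semigroups

$$\widehat{\Phi}_{p,n} \,:\, \mathcal{P}(E_p) \longrightarrow \mathcal{P}(E_n)$$

can be carried out along the same line of argument as the one used in
previous sections. We will of course not rewrite the whole analysis, but we
indicate the precise way to transfer these results.

First we notice that $\widehat{\Phi}_{p,n}$ and $\Phi_{p,n}$ have the same structural
properties. More precisely, the description of $\widehat{\Phi}_{p,n}$ in terms of the updated linear
semigroup $\widehat{Q}_{p,n}$ coincides with that of $\Phi_{p,n}$ by replacing $Q_{p,n}$ by $\widehat{Q}_{p,n}$. To
illustrate this assertion, we recall that $\widehat{\Phi}_{p,n}$ can alternatively be written as

$$\widehat{\Phi}_{p,n}(\mu_p)(f_n) = \frac{\mu_p(\widehat{Q}_{p,n}(f_n))}{\mu_p(\widehat{Q}_{p,n}(1))} = \widehat{\Psi}_{p,n}(\mu_p)\, \widehat{P}_{p,n}(f_n)$$

with the Markov kernel $\widehat{P}_{p,n}$ from $E_p$ into $E_n$ and Boltzmann-Gibbs
transformation $\widehat{\Psi}_{p,n}$ on $\mathcal{P}(E_p)$ associated with the potential $\widehat{G}_{p,n} = \widehat{Q}_{p,n}(1)$ and
defined by

$$\widehat{P}_{p,n}(f_n) = \frac{\widehat{Q}_{p,n}(f_n)}{\widehat{Q}_{p,n}(1)} \quad \text{and} \quad \widehat{\Psi}_{p,n}(\mu_p)(dx_p) = \frac{\widehat{G}_{p,n}(x_p)}{\mu_p(\widehat{G}_{p,n})}\, \mu_p(dx_p)$$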



To take the final step, we recall that the updated Feynman-Kac flow
associated with the pair $(G_n, M_n)$ can be regarded as the prediction flow model
associated with the pair $(\widehat{G}_n, \widehat{M}_n)$ with

$$\widehat{G}_n = M_{n+1}(G_{n+1}) \quad \text{and} \quad \widehat{M}_n(f_n) = M_n(f_n G_n) / M_n(G_n)$$

In addition, using the fact that $\widehat{Q}_n(f_n) = M_n(G_n f_n) = \widehat{G}_{n-1}\, \widehat{M}_n(f_n)$, we
see that the updated semigroups $\widehat{Q}_{p,n}$ are defined as the prediction
semigroups $Q_{p,n}$ by replacing the pairs $(G_n, M_n)$ by the pair $(\widehat{G}_n, \widehat{M}_n)$ (with the
same labeling indexes). From these two simple observations, we conclude
that the whole analysis derived in the previous sections is valid for the
updated semigroups by replacing the quantities $(G_{p,n}, P_{p,n}, Q_{p,n}, \Psi_{p,n})$ and
the pairs $(G_n, M_n)$ by the corresponding quantities $(\widehat{G}_{p,n}, \widehat{P}_{p,n}, \widehat{Q}_{p,n}, \widehat{\Psi}_{p,n})$
and the pairs $(\widehat{G}_n, \widehat{M}_n)$.
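The identification of the updated flow with a prediction flow can be verified on a finite state space. The following is a minimal sketch, assuming a time-homogeneous pair $(G, M)$ given by a vector and a stochastic matrix chosen only for illustration; the helper names are hypothetical.

```python
def boltzmann_gibbs(mu, g):
    """Psi(mu)(x) = g(x) mu(x) / mu(g) on a finite state space."""
    weights = [m * gx for m, gx in zip(mu, g)]
    total = sum(weights)
    return [w / total for w in weights]


def push(mu, kernel):
    """Measure mu M, with kernel[i][j] = M(i, j)."""
    return [sum(mu[i] * kernel[i][j] for i in range(len(mu)))
            for j in range(len(kernel[0]))]


def updated_step(mu, g, kernel):
    """One step of the updated flow: mu -> Psi(mu M)."""
    return boltzmann_gibbs(push(mu, kernel), g)


def prediction_step(mu, g, kernel):
    """One step of the prediction flow: mu -> Psi(mu) M."""
    return push(boltzmann_gibbs(mu, g), kernel)


def hat_pair(g, kernel):
    """G_hat = M(G) and M_hat(x, y) = M(x, y) G(y) / M(G)(x)."""
    g_hat = [sum(row[j] * g[j] for j in range(len(g))) for row in kernel]
    m_hat = [[row[j] * g[j] / gh for j in range(len(g))]
             for row, gh in zip(kernel, g_hat)]
    return g_hat, m_hat


if __name__ == "__main__":
    G = [1.0, 2.0, 0.5]
    M = [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3], [0.4, 0.4, 0.2]]
    mu = [0.2, 0.5, 0.3]
    G_hat, M_hat = hat_pair(G, M)
    print(updated_step(mu, G, M))
    print(prediction_step(mu, G_hat, M_hat))
```

Both printed vectors agree up to rounding, which reflects the identity $\Psi(\mu M)(f) = \mu(\widehat{G}\,\widehat{M}(f)) / \mu(\widehat{G})$.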

It is instructive at this stage to give an example of the contraction
properties that can be transferred using this parallel. We denote by $(\widehat{G})$, $(\widehat{M})_m$,
and $(\widehat{Q})_m$ the mixing conditions on the updated objects defined as $(G)$,
$(M)_m$, and $(Q)_m$ by replacing $(G_n, M_n, Q_n)$ by $(\widehat{G}_n, \widehat{M}_n, \widehat{Q}_n)$. For instance,
the regularity condition $(\widehat{G})$ and mixing condition $(\widehat{M})_m$ read:

$(\widehat{G})$ : There exists a sequence of numbers $\epsilon_n(\widehat{G}) \in (0, 1)$, $n \in \mathbb{N}$, such
that for any $(x_n, y_n) \in E_n^2$ we have $\widehat{G}_n(x_n) \ge \epsilon_n(\widehat{G})\, \widehat{G}_n(y_n) > 0$.

$(\widehat{M})_m$ : There exists a sequence of numbers $\epsilon_n(\widehat{M}) \in (0, 1)$, $n \in \mathbb{N}$, such
that for any $p \in \mathbb{N}$, and $(x_p, y_p) \in E_p^2$, we have

$$\widehat{M}_{p,p+m}(x_p, \cdot) \ge \epsilon_p(\widehat{M})\, \widehat{M}_{p,p+m}(y_p, \cdot)$$

In the display above, $\widehat{M}_{p,n}$ represents the Markov semigroup associated
with the Markov kernels $\widehat{M}_n$. Note that if $(M)_m$ is satisfied for $m = 1$,
then $(\widehat{G})$ holds true with $\epsilon_n(\widehat{G}) \ge \epsilon_n(M)$.

As usual, in order to unify the presentation, in this section we denote by
$H$ the total variation distance on $\mathcal{P}(E)$ or any $h'$-divergence satisfying the
growth condition $(H)_a$ stated on page 135.

Proposition 4.4.1 When conditions (G) and {M)m are met for some 
m > 1, then we have for any n>m 

M%,p+n)<aH{e;HM)e-l^miG))Yl (l - M)) (4.31) 

fc =0 

with e’T\G, M) = eliM) €p+i, p+m(G). 

This way of transferring results from the prediction to the updated models
can be extended in a natural way to situations where the potential functions
$G_n$ are not strictly positive. In this situation, we write $\widehat{E}_n = G_n^{-1}(0, \infty)$.
The main assumption in this context is the accessibility hypothesis $(A)$
introduced in (2.16) on page 67. Rephrasing the discussion given in
Section 2.5, the main advantages of this condition are that $\widehat{M}_{n+1}$ are
well-defined Markov kernels from $\widehat{E}_n$ into $\widehat{E}_{n+1}$ and the potential functions
$\widehat{G}_n$ are strictly positive on $\widehat{E}_n$. If we make this assumption, then $\widehat{\Phi}_{p,n}$ are
well-defined semigroups on the sets $\mathcal{P}(\widehat{E}_p)$ and the whole analysis can be
conducted as before by replacing $E_n$ by $\widehat{E}_n$. In this context, conditions $(\widehat{G})$
and $(\widehat{M})_m$ take the form

$(\widehat{G})$ : There exists a sequence of strictly positive numbers $\epsilon_n(\widehat{G})$, $n \in \mathbb{N}$,
such that

$$\forall (x_n, y_n) \in \widehat{E}_n^2, \qquad \widehat{G}_n(x_n) \ge \epsilon_n(\widehat{G})\, \widehat{G}_n(y_n) > 0$$

$(\widehat{M})_m$ : There exists a sequence of strictly positive numbers $\epsilon_n(\widehat{M})$, $n \in \mathbb{N}$,
such that

$$\forall p \in \mathbb{N},\ \forall (x_p, y_p) \in \widehat{E}_p^2, \qquad \widehat{M}_{p,p+m}(x_p, \cdot) \ge \epsilon_p(\widehat{M})\, \widehat{M}_{p,p+m}(y_p, \cdot)$$

The next three examples illustrate situations where conditions $(\widehat{G})$ and
$(\widehat{M})_m$ are met on the "restricted spaces" $\widehat{E}_n$ but not on $E_n$.

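Example 4.4.1 Suppose that the state spaces are homogeneous with $E_n = \mathbb{Z}^d$,
$d \ge 1$, and $M_n = M$ is the Markov transition on $\mathbb{Z}^d$ defined by

$$M(x, dy) = \sum_{e \in \mathbb{Z}^d \,:\, |e| \le 1} p(e)\, \delta_{x+e}(dy)$$

with $p_{\min} = \inf_{|e| \le 1} p(e) > 0$, and $|x| = \vee_{i=1}^{d} |x^i|$ for all $x \in \mathbb{Z}^d$. We take
the indicator potential function $G = 1_{\widehat{E}}$ associated with the set $\widehat{E} = \{x \in
\mathbb{Z}^d \,;\, |x| \le q\}$, where $q$ is a given strictly positive integer. Since we have for
any $x \in \widehat{E}$ and $|y| > q + 1$

$$\widehat{G}(x) = M(x, \widehat{E}) \ge p(0) > 0 \quad \text{and} \quad \widehat{G}(y) = 0$$

we see that condition $(\widehat{G})$ is not satisfied on the whole lattice but it holds
true on $\widehat{E}$ with $\epsilon(\widehat{G}) \ge p(0)$. Furthermore, the kernel $\widehat{M}$ defined for any
$x \in \widehat{E}$ by

$$\widehat{M}(x, dy) = \frac{1}{M(x, \widehat{E})} \sum_{e \in \mathbb{Z}^d,\ |e| \le 1} p(e)\, 1_{\widehat{E}}(x + e)\, \delta_{x+e}(dy)$$

is a well-defined Markov transition from $\widehat{E}$ into itself. Notice that each
coordinate $x^i \in [-q, +q]$ of $x = (x^i)_{1 \le i \le d}$ can be joined to any coordinate
$z^i \in [-q, +q]$ of $z = (z^i)_{1 \le i \le d}$ with an $\widehat{M}$-admissible path in $\widehat{E}$ of maximal
length $2q$. From this observation, we find the rather crude estimate

$$\forall (x, y, z) \in \widehat{E}^3 \qquad \widehat{M}^m(x, \{z\}) \ge p_{\min}^m\, \widehat{M}^m(y, \{z\})$$

with $m = 2dq$. We conclude that the mixing condition $(\widehat{M})_m$ is not met on
the whole lattice $E$ but it holds true on $\widehat{E}$ with $\epsilon(\widehat{M}) = p_{\min}^m$.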
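The estimates of this example can be checked by brute force in dimension $d = 1$. The sketch below assumes a nearest-neighbor walk on the integers with illustrative jump probabilities $p(-1) = 0.3$, $p(0) = 0.4$, $p(1) = 0.3$ and the restricted space $\{-q, \ldots, q\}$; these values and the function names are hypothetical choices, not taken from the text.

```python
def restricted_kernel(q, p):
    """Return (states, G_hat, M_hat) for the walk with jump law p killed outside [-q, q]."""
    states = list(range(-q, q + 1))
    index = {x: i for i, x in enumerate(states)}
    g_hat, kernel = [], []
    for x in states:
        # G_hat(x) = M(x, E_hat): mass of the jumps that stay inside [-q, q].
        mass = sum(w for e, w in p.items() if -q <= x + e <= q)
        row = [0.0] * len(states)
        for e, w in p.items():
            if -q <= x + e <= q:
                row[index[x + e]] += w / mass
        g_hat.append(mass)
        kernel.append(row)
    return states, g_hat, kernel


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_pow(a, power):
    size = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(power):
        result = mat_mul(result, a)
    return result


if __name__ == "__main__":
    q, p = 2, {-1: 0.3, 0: 0.4, 1: 0.3}
    states, g_hat, m_hat = restricted_kernel(q, p)
    m = 2 * q  # m = 2dq with d = 1
    m_step = mat_pow(m_hat, m)
    print(min(g_hat), min(min(row) for row in m_step), min(p.values()) ** m)
```

With these choices every entry of the $m$-step restricted kernel is at least $p_{\min}^m$, which is the crude lower bound used in the example.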







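Example 4.4.2 Again we assume the state space to be homogeneous, $E_n = \mathbb{R}$,
and let $M_n = M$ be the Gaussian transition on the real line defined by

$$M(x, dy) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}\,(y - a(x))^2\right) dy$$

where $a : \mathbb{R} \to \mathbb{R}$ is a given Borel drift function. We let $G = 1_{\widehat{E}}$ be the
indicator of a given Borel subset $\widehat{E} \subset \mathbb{R}$ and we suppose

$$|\widehat{E}| = \sup\{|x| \,,\ x \in \widehat{E}\} < \infty \quad \text{and} \quad \|a\| = \sup\{|a(x)| \,,\ x \in \widehat{E}\} < \infty$$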

In this situation, the Markov kernel M is defined on the whole real line and 
it is given by 




with 

M{x,E) = G{x) = exp^-^(y-o(x))2^ dy 

After some elementary computations, we find that for any (x,y,z) £ E we 
have 



log g [-c(o),c(a)] 

with the crude estimate c{a) < 2||a|| (|E| -h ||a||). This clearly implies that 

dM(y,.) M{x,E) dM{y,.y ’ 

We conclude that (G) and (M)m a,re satisfied with m = 1 on E with 
e(G)>e-‘=l“l and e(M) > 



For an unbounded drift function $a$ and unbounded Borel set $\widehat{E}$, the reader
will notice that conditions $(\widehat{G})$ and $(\widehat{M})_m$ are not met.

Example 4.4.3 (One-dimensional neutron model) The following
simplified neutron collision/absorption model is taken from Harris [176]. We
assume that $E_n = \mathbb{R}$ and the pair $(G_n, M_n) = (G, M)$ is homogeneous and
given by

$$G(x) = 2 \cdot 1_{[0,L]}(x) \quad \text{and} \quad M(x, dy) = \frac{c}{2}\, e^{-c|y - x|}\, dy$$

where $L > 0$ and $c > 0$ are given constants. In this situation, we check that
$\widehat{E} = [0, L]$, and for any $x \in [0, L]$ we have

$$\widehat{G}(x) = M(G)(x) = 2 - \left(e^{-cx} + e^{-c(L-x)}\right)$$

$$\widehat{M}(x, dy) = c\, \widehat{G}(x)^{-1}\, 1_{[0,L]}(y)\, e^{-c|y - x|}\, dy$$

We now observe that $1 - e^{-cL} \le \widehat{G}(x) \le 2$, from which we conclude that
conditions $(\widehat{G})$ and $(\widehat{M})_m$ are met on $[0, L]$ with $m = 1$ and

$$\epsilon(\widehat{G}) = (1 - e^{-cL})/2 \quad \text{and} \quad \epsilon(\widehat{M}) = e^{-cL}\, \epsilon(\widehat{G})$$
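The closed form of the updated potential in the neutron model can be compared with a direct numerical integration. The sketch below assumes the two-sided exponential kernel $M(x, dy) = \frac{c}{2} e^{-c|y-x|} dy$ and $G = 2 \cdot 1_{[0,L]}$, with illustrative values $c = 1.5$ and $L = 2$; the midpoint rule and the function names are choices made here, not part of the text.

```python
import math


def g_hat_closed_form(x: float, c: float, length: float) -> float:
    """M(G)(x) = 2 - (exp(-c x) + exp(-c (L - x))) for x in [0, L]."""
    return 2.0 - (math.exp(-c * x) + math.exp(-c * (length - x)))


def g_hat_quadrature(x: float, c: float, length: float, steps: int = 20000) -> float:
    """Midpoint rule for the integral over [0, L] of 2 * (c/2) * exp(-c |y - x|) dy."""
    h = length / steps
    return sum(c * math.exp(-c * abs((k + 0.5) * h - x)) * h for k in range(steps))


if __name__ == "__main__":
    c, length = 1.5, 2.0
    for x in (0.0, 0.7, 2.0):
        print(x, g_hat_closed_form(x, c, length), g_hat_quadrature(x, c, length))
    print("lower bound:", 1.0 - math.exp(-c * length))
```

On a grid of points in $[0, L]$ the closed form stays between $1 - e^{-cL}$ and $2$, so the ratio of any two values of the updated potential is at least $(1 - e^{-cL})/2$.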



In general, these two conditions are not easy to check, mainly because
the updated kernels $\widehat{M}_n$ are related to the potential function $G_n$. Our
next objective is to give a sufficient condition in terms of the reference
pair $(G_n, M_n)$ under which the central mixing hypothesis $(\widehat{Q})_m$ is met. We
recall that in the context of updated semigroups we have



P »,n — P^qPq.n with P^qifq) ~ Qp,q{fq^q,n)IQp,q{Gq,n) 



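Proposition 4.4.2 Suppose conditions $(G)$ and $(M)_m$ are met for some
$m \ge 1$. Then, the mixing condition $(\widehat{Q})_m$ is also met with

$$\epsilon_p(\widehat{Q}) \ge \epsilon_p(M)\, \epsilon_{p+1,p+m}(G)$$

and we have for any $0 \le p + m \le n$ and $(x_p, y_p) \in E_p^2$ the uniform estimate

$$\widehat{G}_{p,n}(x_p) / \widehat{G}_{p,n}(y_p) \ge \epsilon_p(M)\, \epsilon_{p+1,p+m}(G)$$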



In addition, for any 0<p + m<q<n, we have 

< 1 - €p+i,p+„,(G) eliM) (4.32) 



Proof: 

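For any nonnegative function $f_n \in \mathcal{B}_b(E_n)$ and $x_p, y_p \in E_p$, $0 \le p + m \le n$,
we have

$$\frac{\widehat{Q}_{p,n}(f_n)(x_p)}{\widehat{Q}_{p,n}(f_n)(y_p)} = \frac{M_{p+1}(G_{p+1}\, \widehat{Q}_{p+1,n}(f_n))(x_p)}{M_{p+1}(G_{p+1}\, \widehat{Q}_{p+1,n}(f_n))(y_p)}$$

$$\ge \epsilon_{p+1}(G)\, \frac{M_{p,p+2}(G_{p+2}\, \widehat{Q}_{p+2,n}(f_n))(x_p)}{M_{p,p+2}(G_{p+2}\, \widehat{Q}_{p+2,n}(f_n))(y_p)}$$

$$\ge \epsilon_{p+1}(G)\, \epsilon_{p+2}(G)\, \frac{M_{p,p+3}(G_{p+3}\, \widehat{Q}_{p+3,n}(f_n))(x_p)}{M_{p,p+3}(G_{p+3}\, \widehat{Q}_{p+3,n}(f_n))(y_p)}$$

Using a clear induction, we find that

$$\frac{\widehat{Q}_{p,n}(f_n)(x_p)}{\widehat{Q}_{p,n}(f_n)(y_p)} \ge \left[ \prod_{k=1}^{m-1} \epsilon_{p+k}(G) \right] \frac{M_{p,p+m}(G_{p+m}\, \widehat{Q}_{p+m,n}(f_n))(x_p)}{M_{p,p+m}(G_{p+m}\, \widehat{Q}_{p+m,n}(f_n))(y_p)}$$

from which we conclude that

$$\frac{\widehat{Q}_{p,n}(f_n)(x_p)}{\widehat{Q}_{p,n}(f_n)(y_p)} \ge \epsilon_{p+1,p+m}(G)\, \epsilon_p(M)$$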



Using similar arguments, for any 0<p + m<g<nwe find that 






_ ^p-hl(^p-hlQp+ltq(fg^g,n))(^p) 
^p-hl(^p+lQp+l,g(Gg,n))(^p) 

^ g ^P<P+^iGp+2Qp+2,g{f qGq^n)){Xp) 

- Mp.p+2(Gp+2Qp+2,,(G,,„))(Xp) 



and by induction 



Mp,p-\-m \Gp-\-mQjH-m,g (G^,n ) ) {^pj 



k=l 



> €p+i „+M el(M) 

" ' Mf,,j,+m{Gp^mQp+m,g{Gg,n)){yp) 

The proof of (4.32) is now clear. This ends the proof of the proposition. ■ 



We end this section with simple contraction estimates that can be
deduced from Proposition 4.4.2 using previous considerations (see also
Theorem 4.3.4).

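Proposition 4.4.3 Suppose conditions $(G)$ and $(M)_m$ are satisfied. Then,
for any $n \ge m$, we have

$$\beta_H(\widehat{\Phi}_{p,p+n}) \le a_H\!\left(\epsilon_p^{-1}(M)\, \epsilon_{p+1,p+m}^{-1}(G)\right) \prod_{k=0}^{\lfloor n/m \rfloor - 1} \left(1 - \epsilon_{p+km}^{(m)}(G, M)\right) \qquad (4.33)$$

with $\epsilon_p^{(m)}(G, M) = \epsilon_p^2(M)\, \epsilon_{p+1,p+m}(G)$. When condition $(M)_m$ is satisfied
with $m = 1$, we have the potential free estimates

$$\beta_H(\widehat{\Phi}_{p,p+n}) \le a_H\!\left(\epsilon_p^{-1}(M)\right) \prod_{k=0}^{n-1} \left(1 - \epsilon_{p+k}^2(M)\right)$$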

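Corollary 4.4.1 Assume that conditions $(G)$ and $(M)_m$ are met for some
$m \ge 1$. For any $(\mu_p, \nu_p) \in \mathcal{P}(E_p)^2$ with $H(\mu_p, \nu_p) < \infty$ and $p \in \mathbb{N}$, we
have

$$\sum_{n \ge 0} \epsilon_n^2(M)\, \epsilon_{n+1,n+m}(G) = \infty \;\Longrightarrow\; \lim_{n \to \infty} H\left(\widehat{\Phi}_{p,p+n}(\mu_p), \widehat{\Phi}_{p,p+n}(\nu_p)\right) = 0$$

In addition, if we have $\lim_{n \to \infty} \frac{1}{n} \sum_{p=0}^{n} \epsilon_p^2(M)\, \epsilon_{p+1,p+m}(G) = \epsilon$, then we
have the asymptotic exponential decay

$$\lim_{n \to \infty} \frac{1}{n} \log H\left(\widehat{\Phi}_{p,p+nm}(\mu_p), \widehat{\Phi}_{p,p+nm}(\nu_p)\right) \le -\epsilon$$

Finally, if we assume that $\inf_{n \ge 0} \epsilon_n(G) = \epsilon(G)$ and $\inf_{n \ge 0} \epsilon_n(M) = \epsilon(M)$,
then we have

$$H\left(\widehat{\Phi}_{p,p+nm}(\mu_p), \widehat{\Phi}_{p,p+nm}(\nu_p)\right) \le c\, \exp\left(-n\, \epsilon^2(M)\, \epsilon^{m-1}(G)\right)$$

for some finite constant $c < \infty$ whose values only depend on the pair
$(\epsilon^{m-1}(G), \epsilon(M))$ and on the relative entropy $H(\mu_p, \nu_p)$ between the two
measures $\mu_p, \nu_p$. In particular, when $m = 1$, we have a uniform estimate

$$H\left(\widehat{\Phi}_{p,p+n}(\mu_p), \widehat{\Phi}_{p,p+n}(\nu_p)\right) \le c\, \exp\left(-n\, \epsilon^2(M)\right)$$

with a finite constant $c$ that does not depend on $\epsilon(G)$ (nor on $H(\mu_p, \nu_p)$).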

We finally examine the impact of these contraction results in the study of
time-homogeneous models. It is also instructive to connect the forthcoming
discussion with the one given at the end of Section 4.3.3 on the semigroups
$\Phi_{p,n}$. As usual, we suppose the various mathematical objects are
time-homogeneous and we suppress the time index in the notation. We also
restrict ourselves to the situation where $(G)$ and $(M)_m$ are satisfied for
some $m \ge 1$. In this context, we have proved the inequalities

$$\beta_H(\widehat{\Phi}_{0,nm}) \le a_H\!\left(\epsilon^{-1}(M)\, \epsilon^{-(m-1)}(G)\right) \left(1 - \epsilon^2(M)\, \epsilon^{m-1}(G)\right)^n \qquad (4.34)$$

We finally introduce the relaxation time

$$\Delta_H(\widehat{\Phi}) = \inf\left\{ n \in \mathbb{N} \,;\ \beta_H(\widehat{\Phi}_{0,n}) \le 1/e \right\}$$

Using simple computations, we find that

$$\Delta_H(\widehat{\Phi}) \le m\, \frac{1 + \log a_H\!\left(\epsilon^{-1}(M)\, \epsilon^{-(m-1)}(G)\right)}{\epsilon^2(M)\, \epsilon^{m-1}(G)}$$



4.5 A Class of Stochastic Semigroups 

In this section, we use the Markov contraction estimates developed in
Section 4.2 to study the stability of a class of stochastic Feynman-Kac models
arising in nonlinear filtering problems. Studying the sensitivity of filtering
equations with respect to initialization errors consists in comparing the long
time behavior of an incorrectly initialized filter with that of the exact
optimal filter. In Chapter 12, Section 12.6, we will see that the semigroup
associated with the filter equation is a nonlinear Feynman-Kac semigroup. As a
result, we can estimate the asymptotic stability properties of these nonlinear
semigroups using the contraction estimates provided in Section 4.3 and
Section 4.4.

Under some rather strong regularity and mixing conditions on the pair
$(G_n, M_n)$ (see page 139), we have derived several contraction estimates with
respect to a class of $h$-relative entropy criteria. The aim of this section is
to replace these conditions by a single hypothesis on the Dobrushin
contraction coefficient of $M_n$. The strategy consists in combining the Markov
contraction analysis presented in Section 4.2 with some entropy inequalities
recently obtained by Ocone in [260] (see also [59]).

We consider a sequence of measurable spaces $(E_n, \mathcal{E}_n)$, $(F_n, \mathcal{F}_n)$, $n \in \mathbb{N}$,
and a measurable nonnegative potential function $g_n : E_n \times F_n \to (0, \infty)$.
We also suppose there exists a nonnegative measure $q_n$ on $F_n$ such that for
any $x_n \in E_n$ and $n \in \mathbb{N}$ we have

$$\int_{F_n} g_n(x_n, y_n)\, q_n(dy_n) = 1 \qquad (4.35)$$

let Tfo € V{Eq), and let M„+i be a Markov kernel from En into £’n+i. We 
associate with these objects the Markov chain 



(fl, Q, Un = {Yn-l,T]n)n>0, P) 

taking values in (F„_i x En,Fn-i®En) with initial distribution 

with an arbitrary $\mu_{-1} \in \mathcal{P}(F_{-1})$, and elementary transitions given for any
$F_n \in \mathcal{B}_b(F_n \times \mathcal{P}(E_{n+1}))$ by the formula

$$E\left(F_n(Y_n, \eta_{n+1}) \,|\, (Y_{n-1}, \eta_n)\right) = \int_{E_n \times F_n} F_n\left(y_n, \Psi_{n,y_n}(\eta_n) M_{n+1}\right)\, g_n(x_n, y_n)\, q_n(dy_n)\, \eta_n(dx_n)$$

For each $y_n \in F_n$, $\Psi_{n,y_n} : \mathcal{P}(E_n) \to \mathcal{P}(E_n)$ is the Boltzmann-Gibbs
mapping associated with the nonhomogeneous potential function $x_n \in E_n \mapsto
g_n(x_n, y_n) \in (0, \infty)$. That is, we have

$$\Psi_{n,y_n}(\eta_n)(dx_n) = \frac{g_n(x_n, y_n)}{\int_{E_n} g_n(x'_n, y_n)\, \eta_n(dx'_n)}\, \eta_n(dx_n)$$

Notice that the stochastic flow $\eta_n$ satisfies a nonlinear and random recursive
equation starting at $\eta_0$ at time $n = 0$

$$\eta_{n+1} = \Psi_{n,Y_n}(\eta_n)\, M_{n+1}$$

Let $\eta'_n$ be an auxiliary model defined with the same random equation

$$\eta'_{n+1} = \Psi_{n,Y_n}(\eta'_n)\, M_{n+1}$$

but starting at some possibly different $\eta'_0 \in \mathcal{P}(E_0)$.

By construction, it is also clear that the triplet $(Y_{n-1}, \eta_n, \eta'_n)$, $n \in \mathbb{N}$,
forms a Markov chain taking values in $(F_{n-1} \times \mathcal{P}(E_n)^2)$. In nonlinear
filtering literature, the distributions $\eta_n$ and $\Psi_{n,Y_n}(\eta_n) = \widehat{\eta}_n$ are called the
one-step predictor and the optimal filter. The flows $\eta'_n$ and $\Psi_{n,Y_n}(\eta'_n) = \widehat{\eta}'_n$
are called the wrong initialized models. The distribution $\eta_0$ represents the
initial distribution of a Markov signal, and the sequence $Y_n$ represents the
noisy and partial observation delivered by the sensors. In practice, $\eta_0$ is
generally unknown and we traditionally initialize the filter or any kind of
approximation scheme with a wrong initial condition. One important problem
is clearly to find sufficient conditions ensuring that the filtering problem
is well posed in the sense that it corrects any wrong initial condition.

The next theorem is a simple application of the Markov contraction estimates
developed in Section 4.2.

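Theorem 4.5.1 For any $n \in \mathbb{N}$, we have

$$E\left(\mathrm{Ent}(\widehat{\eta}_n \,|\, \widehat{\eta}'_n)\right) \le E\left(\mathrm{Ent}(\eta_n \,|\, \eta'_n)\right) \le \left[ \prod_{p=1}^{n} \beta(M_p) \right] \mathrm{Ent}(\eta_0 \,|\, \eta'_0) \qquad (4.36)$$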



Proof: 

If 7J0 »/o> (4-36) is equal to oo and the second inequality 

is trivial. The l.h.s. inequality is also trivial as soon as t/„ 5^ ?/(,. Otherwise 
we notice that 



Since we have 



we find 



^^n.YniVn) _ VniOnj* y^n)) d/rjn 

d"^n,Yn{Vn) Vn{9n{>yYn)) dTj^n 



p I \ ^n(gn(»>^n)) . f i f dfln \ gn(«>yn) 

Enlfh I %) +/ ' >hb„(.,yn)) 



By construction, we have 
E(Ent(^„ I fTJ I (Tn-i,»7„)) 



+ / *08 f ) 9n(Xn,Vn) 9n(<fj/n) »?n(dx„) 

Je„xF„ \«»?n / 







Using (4.35), this implies that 
E{Ent{f}n 1 ffj 1 (y„-i,»?n)) 

= Ent(,/„ I - X 

On the other hand, applying the Fubini theorem and again (4.35), we also
have

$$\int_{F_n} \eta_n(g_n(\cdot, y_n))\, q_n(dy_n) = 1$$

Therefore the term 

represents the relative entropy between two equivalent distributions on F„. 
Thus, we find the almost sure estimate 

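$$E\left(\mathrm{Ent}(\widehat{\eta}_n \,|\, \widehat{\eta}'_n) \,|\, (Y_{n-1}, \eta_n)\right) \le \mathrm{Ent}(\eta_n \,|\, \eta'_n) \qquad (4.37)$$

This ends the proof of the l.h.s. inequality in (4.36). By Theorem 4.2.1, we
also have for any $n \ge 1$

$$\mathrm{Ent}(\eta_n \,|\, \eta'_n) = \mathrm{Ent}(\widehat{\eta}_{n-1} M_n \,|\, \widehat{\eta}'_{n-1} M_n) \le \beta(M_n)\, \mathrm{Ent}(\widehat{\eta}_{n-1} \,|\, \widehat{\eta}'_{n-1}) \qquad (4.38)$$

If we combine (4.37) and (4.38), we readily end the proof of the theorem. ■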
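The two inequalities combined in this proof can be checked exactly on finite spaces. The following is a minimal sketch, assuming a three-point signal space, a two-point observation space with counting reference measure, and likelihood rows summing to one so that (4.35) holds; all numerical values and function names are hypothetical. The Dobrushin coefficient is computed as the largest total variation distance between two rows of the kernel.

```python
import math


def entropy(mu, nu):
    """Relative entropy Ent(mu | nu) between two strictly positive distributions."""
    return sum(m * math.log(m / n) for m, n in zip(mu, nu) if m > 0.0)


def filter_update(eta, column):
    """Boltzmann-Gibbs update with potential x -> g(x, y) for a fixed observation y."""
    weights = [e * c for e, c in zip(eta, column)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_filter_entropy(eta, eta_prime, g):
    """E(Ent(filter | wrong filter)) when Y has density y -> eta(g(., y))."""
    total = 0.0
    for y in range(len(g[0])):
        column = [g[x][y] for x in range(len(eta))]
        prob_y = sum(e * c for e, c in zip(eta, column))
        total += prob_y * entropy(filter_update(eta, column),
                                  filter_update(eta_prime, column))
    return total


def push(mu, kernel):
    return [sum(mu[i] * kernel[i][j] for i in range(len(mu)))
            for j in range(len(kernel[0]))]


def dobrushin(kernel):
    """beta(M) = sup over pairs of rows of their total variation distance."""
    return max(0.5 * sum(abs(a - b) for a, b in zip(r1, r2))
               for r1 in kernel for r2 in kernel)


if __name__ == "__main__":
    eta, eta_prime = [0.5, 0.3, 0.2], [0.1, 0.2, 0.7]
    g = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
    M = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.3, 0.3, 0.4]]
    print(expected_filter_entropy(eta, eta_prime, g), entropy(eta, eta_prime))
    print(entropy(push(eta, M), push(eta_prime, M)),
          dobrushin(M) * entropy(eta, eta_prime))
```

The first printed pair illustrates the updating step (4.37), the second the prediction step (4.38); in both cases the left value does not exceed the right one.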




5 

Invariant Measures and Related Topics 



5.1 Introduction 

One of the central questions in the theory of time-homogeneous
Feynman-Kac semigroups is the existence of invariant measures and the rapidity
at which the memory of the initial distribution is lost. This section is centered
around this theme. This question is related to different kinds of problems
arising in physics, engineering, and applied probability. To guide the reader
and give some concrete basis to this section, we have chosen to give a brief
introduction on the different ways to interpret and to answer this question.

In physics, the invariant measures may describe the limiting behavior of
a nonabsorbed Markov particle evolving in a pocket of obstacles (see
Section 2.5.1). In this connection, we also mention that the Lyapunov exponent
of Feynman-Kac-Schrodinger semigroups and related spectral quantities is
also described in terms of these limiting measures. In biology, Feynman-Kac
models can be viewed as distribution flows of infinite population genetic
algorithms (see Section 2.5.2). In this interpretation, invariant measures
represent the asymptotic concentration of individuals or genes in a genetic
population evolution model. In the preceding application areas, the pair
of homogeneous potentials/kernels $(G, M)$ is dictated by the problem at
hand. The Markov kernel $M$ may represent the physical motion of a particle
in some environment $(E, \mathcal{E})$ as well as the mutation of individuals or
genes in some natural evolution process. In particle trapping problems, the
potential $G$ represents the absorption rate and the strength of the obstacles
in the medium. In biology, $G$ is instead interpreted as the selection pressure
of the environment.

From a somewhat radically different angle, Feynman-Kac semigroups
can be thought of as a natural extension of Markov semigroups. To better
understand this point of view, it is useful to recall that Feynman-Kac
models also have several nonhomogeneous Markov interpretations (see
Section 2.5.2). The essential difference between the corresponding McKean
models and traditional homogeneous Markov chains is that their elementary
transitions depend on the distribution flow of the random states. In
this perspective, one can ask the following question:

Is it possible to build a Feynman-Kac model admitting a given
distribution as an invariant measure?

If we restrict this question to the class of Feynman-Kac models with
constant potential functions, then the question above is equivalent to that
of finding a Markov model having a given invariant measure. This is of
course the traditional central question in Monte Carlo Markov chain
literature. There have been considerable efforts during the past decades to
answer this important question. Several algorithms have been proposed,
including the popular Metropolis model and the Gibbs sampler. We refer
the reader for instance to the book by D.J. Spiegelhalter, W.R. Gilks and
S. Richardson [291] and to the pioneering articles of W.K. Hastings [178]
and N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, E. Teller and A.H.
Teller [249]. If we extend the question to nonlinear (or nonhomogeneous)
Markov models, then one expects to obtain new Monte Carlo simulation
methods that work.

Of course, in contrast to traditional Markov chain methods these nonlin- 
ear models cannot be sampled perfectly, and another level of approximation 
is needed. In this connection, the particle methods discussed in this book 
provide a natural and successful strategy to produce approximate samples 
according to these McKean models. In the present section, we will not dis- 
cuss the performance of these particle numerical schemes. Their asymptotic 
behavior as the size of the systems increases will be discussed in full detail 
in the further development of Chapters 7 to 10. We also refer the reader to 
the discussion on particle approximation measures provided in Section 3.5. 
Here we concentrate our discussion on the modeling of Feynman-Kac semi- 
groups admitting a given distribution as an invariant measure. 

This section is organized as follows. In a preliminary section, Section 5.2,
we discuss the existence and uniqueness of invariant measures. We connect
this question with the contraction analysis developed in Section 4.3. We
provide simple sufficient conditions on the pair $(G, M)$ under which the
corresponding Feynman-Kac semigroup has a unique invariant measure.
We will also transfer the contraction estimates provided in earlier sections
to quantify the decays to equilibrium of the corresponding McKean models.







In Section 5.3, we design an original strategy to build Feynman-Kac mod- 
els admitting a given distribution as an invariant measure. These models 
are related to a judicious choice of Radon-Nikodym and Metropolis type 
potential functions. The Feynman-Kac modeling technique presented in 
this section gives natural powerful tools for developing new particle
simulation methods. Because of their importance in practice, we have devoted
a separate section, Section 5.5, to these Feynman-Kac-Metropolis models.
We already mentioned that these nonlinear models have better decays to 
equilibrium than traditional Monte Carlo Markov chain algorithms. Fur- 
thermore, they are not only useful in drawing samples according to a given 
target distribution. They also induce new genealogical particle simulation 
methods for drawing samples according to the law of restricted Markov 
chains with respect to their terminal values. 

In the further development of this chapter, we will use the same termi- 
nology as was used in Section 2.7 for abstract Feynman-Kac semigroups. 
However, in the present homogeneous situation, we have chosen to clarify 
the presentation, and we adopt a slightly more simple system of notation. 
Next we discuss these simplifications and take this opportunity to fix some
simple but generic properties of invariant measures.

When there is no possible confusion, we suppress the time index and we
write $(E, G, M)$ and $(\Phi, \widehat{\Phi})$ instead of $(E_n, G_n, M_n)$ and $(\Phi_n, \widehat{\Phi}_n)$.

To avoid repetition, we denote respectively by $f$ and $\mu$ a test function in
$\mathcal{B}_b(E)$ and a probability measure on $E$. We also use the letter $x$ to denote
a point in $E$.

We recall that the one-step mappings $\Phi$ and $\widehat{\Phi} : \mathcal{P}(E) \to \mathcal{P}(E)$ are
defined by the formulae

$$\Phi(\mu) = \Psi(\mu) M \quad \text{and} \quad \widehat{\Phi}(\mu) = \Psi(\mu M) \qquad (5.1)$$

where $\Psi : \mathcal{P}(E) \to \mathcal{P}(E)$ is the Boltzmann-Gibbs transformation associated
with a bounded nonnegative potential function $G$ on $E$. That is, we have
that

$$\Psi(\mu)(dx) = \frac{1}{\mu(G)}\, G(x)\, \mu(dx) \qquad (5.2)$$

Unless otherwise stated, we will assume that $G$ is strictly positive so that
$\Phi$ is well-defined on the whole set of distributions on $E$.

Definition 5.1.1 Given a mapping $\Theta : \mathcal{P}(E) \to \mathcal{P}(E)$, a measure $\mu \in \mathcal{P}(E)$
is said to be $\Theta$-invariant if $\mu = \Theta(\mu)$. When $\Theta = \Phi$ (or $\Theta = \widehat{\Phi}$), sometimes
we say that $\mu$ is $\Phi$-invariant/$(G, M)$ (or $\widehat{\Phi}$-invariant/$(G, M)$) to emphasize
for which pair $(G, M)$ the measure $\mu$ is $\Theta$-invariant.

We end this preliminary section with a simple observation.

Proposition 5.1.1 For any potential function $G : E \to [0, \infty)$ and for
any distribution $\mu \in \mathcal{P}(E)$ such that $\mu(G) \wedge \mu M(G) > 0$, the following
assertions are satisfied.

$$\mu \text{ is } \Phi\text{-invariant}/(G, M) \;\Longrightarrow\; \Psi(\mu) \text{ is } \widehat{\Phi}\text{-invariant}/(G, M)$$

$$\mu \text{ is } \widehat{\Phi}\text{-invariant}/(G, M) \;\Longrightarrow\; \mu M \text{ is } \Phi\text{-invariant}/(G, M)$$

Proof:

Let us assume that $\mu$ is $\Phi$-invariant/$(G, M)$. In this case, we have

$$\widehat{\Phi}(\Psi(\mu)) = \Psi(\Psi(\mu) M) = \Psi(\Phi(\mu)) = \Psi(\mu)$$

and the first assertion is proved. To check the second implication, we assume
that $\mu$ is $\widehat{\Phi}$-invariant/$(G, M)$. In this situation, we observe that

$$\Phi(\mu M) = \Psi(\mu M) M = \widehat{\Phi}(\mu) M = \mu M$$

This clearly ends the proof of the second assertion. ■
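Both implications can be illustrated on a finite state space. The sketch below assumes a strictly positive potential vector and a strictly positive stochastic matrix, chosen only for illustration, and approximates a fixed point of the prediction mapping by plain iteration; the iteration scheme and the function names are assumptions made here, not constructions from the text.

```python
def boltzmann_gibbs(mu, g):
    """Psi(mu)(x) = g(x) mu(x) / mu(g)."""
    weights = [m * gx for m, gx in zip(mu, g)]
    total = sum(weights)
    return [w / total for w in weights]


def push(mu, kernel):
    """Measure mu M."""
    return [sum(mu[i] * kernel[i][j] for i in range(len(mu)))
            for j in range(len(kernel[0]))]


def phi(mu, g, kernel):
    """Prediction mapping Phi(mu) = Psi(mu) M."""
    return push(boltzmann_gibbs(mu, g), kernel)


def phi_hat(mu, g, kernel):
    """Updated mapping Phi_hat(mu) = Psi(mu M)."""
    return boltzmann_gibbs(push(mu, kernel), g)


def approximate_fixed_point(g, kernel, iterations=2000):
    """Iterate Phi from the uniform distribution (assumed to converge here)."""
    mu = [1.0 / len(g)] * len(g)
    for _ in range(iterations):
        mu = phi(mu, g, kernel)
    return mu


if __name__ == "__main__":
    G = [1.0, 2.0, 0.5]
    M = [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3], [0.4, 0.4, 0.2]]
    mu = approximate_fixed_point(G, M)
    nu = boltzmann_gibbs(mu, G)
    print(mu, phi(mu, G, M))          # mu is (numerically) Phi-invariant
    print(nu, phi_hat(nu, G, M))      # Psi(mu) is then Phi_hat-invariant
```

Starting from the approximate fixed point of the prediction mapping, its Boltzmann-Gibbs transform is left unchanged by the updated mapping, and pushing that transform through $M$ gives back the original measure.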



5.2 Existence and Uniqueness 



In this short section, we apply the contraction analysis developed in
Section 4.3.3 and Section 4.4 to study the existence and the uniqueness of
invariant measures. Before presenting the main theorem of this section, we
give a brief discussion on some interesting consequences of the existence of
these measures.

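Suppose $\eta = \Phi(\eta) \in \mathcal{P}(E)$ is a fixed point of $\Phi$ and let $\gamma_n$ and $\eta_n$ be the
unnormalized and the normalized Feynman-Kac model starting at $\gamma_0 = \eta$
and $\eta_0 = \eta$. By construction, we have

$$\eta_n = \Phi^n(\eta) = \eta \quad \text{and} \quad \gamma_n(f) = E_\eta\left( f(X_n) \prod_{p=0}^{n-1} G(X_p) \right)$$

where $E_\eta(\cdot)$ is the expectation with respect to the law of a homogeneous
Markov chain $X_n$ with transitions $M$ and initial distribution $\eta$. In view of
Proposition 2.3.1, we find that

$$E_\eta\left( f(X_n) \prod_{p=0}^{n-1} G(X_p) \right) = \eta(f)\, \eta(G)^n$$

In particular, if we take constant test functions, we have proved that

$$\frac{1}{n} \log E_\eta\left( \prod_{p=0}^{n-1} G(X_p) \right) = \log \eta(G)$$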






The precise physical interpretation of this formula is given in Section 12.4.
We have seen that $\log(\eta(G))$ represents the logarithmic Lyapunov exponent
of the semigroup $Q(f) = G\, M(f)$ on the Banach space $\mathcal{B}_b(E)$. In
this connection and whenever $G$ is a $[0, 1]$-valued potential, the quantity
$\eta(G)^n \in [0, 1]$ represents the probability that an absorbed particle motion
is still alive at time $n$ (see Section 2.5.1). The following theorem, which
is a direct consequence of Proposition 4.3.5 and Proposition 4.4.3, is often
useful in applications.

Theorem 5.2.1 Suppose conditions $(G)$ and $(M)_m$ are met for some
integer parameter $m \ge 1$ and some numbers $\epsilon(G) > 0$ and $\epsilon(M) > 0$. Then
there exists a unique invariant measure $\eta = \Phi(\eta) \in \mathcal{P}(E)$ and for any
$n \in \mathbb{N}$ we have



where $E_\eta(\cdot)$ is the expectation with respect to the law of a homogeneous
Markov chain $X_n$ with transitions $M$ and initial distribution $\eta$.
Furthermore, if we denote by $H$ the total variation distance on the set $\mathcal{P}(E)$ or any
$h'$-divergence $H$ satisfying the growth condition $(H)_a$, then for any $n \ge m$
we have the estimate

for some finite constant c\ < oo whose values only depend on e{M) and 
c^(G). In addition, fj = $( 77 ) G V{E) is the unique ^-invariant measure 
and for any n>m we have the estimate 

H($”(/i),^)<C2 (l-e2(M)e— 

for some finite constant C 2 < 00 whose values only depend on e{M) and 
particular, when {M)m is satisfied for m = I, we have a 
uniform estimate 

with a finite constant $c < \infty$ that only depends on $\epsilon(M)$.



5.3 Invariant Measures and Feynman-Kac 
Modeling 

In this section, we design a natural strategy to construct a Feynman-Kac 
model admitting a given distribution as an invariant measure. To describe 







this method precisely, it is convenient to introduce another round of
notation. One key idea is to enlarge the state space. We suppose $E = (S \times S)$
is the twofold product of a given measurable space $(S, \mathcal{S})$ and we denote
by $K(y, dz)$ a given Markov kernel on $S$. We associate with $K$ the Markov
kernel $M^K$ on $E$ defined in the synthetic integral form

$$M^K((y, z), d(y', z')) = \delta_z(dy')\, K(y', dz')$$

In other words, if $(Y_n)_{n \ge 0}$ is the $S$-valued Markov chain with elementary
transition $K$, then $M^K$ is the Markov transition of the Markov chain
defined by

$$X_n = (Y_n, Y_{n+1}) \in E = S \times S$$

Finally, we associate with any $\pi \in \mathcal{P}(S)$ and with any Markov kernel $K$ on
$S$ the distributions $(\pi \times K)_1$ and $(\pi \times K)_2$ on $E$ defined by

$$(\pi \times K)_1(d(y, y')) = \pi(dy)\, K(y, dy')$$

$$(\pi \times K)_2(d(y, y')) = \pi(dy')\, K(y', dy)$$

Sometimes we simplify notation and we write $\pi \times K$ instead of $(\pi \times K)_1$.
The following proposition is pivotal.

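Proposition 5.3.1 Let $M$ be a Markov kernel on $E$. For any pair of
distributions $\mu, \eta \in \mathcal{P}(E)$, we have

$$\mu \ll \eta \ \text{ and } \ \mu M = \eta \implies
\begin{cases} \eta \text{ is } \Phi\text{-invariant} \\ \mu \text{ is } \hat\Phi\text{-invariant} \end{cases}
\quad \text{with respect to } (G, M) = \left(\frac{d\mu}{d\eta}, M\right)$$

In particular, for any $\pi \in \mathcal{P}(S)$ and any pair of Markov kernels $(K, L)$ on
$S$, we have

$$(\pi \times L)_2 \ll (\pi \times K)_1 \implies
\begin{cases} (\pi \times K)_1 \text{ is } \Phi\text{-invariant} \\ (\pi \times L)_2 \text{ is } \hat\Phi\text{-invariant} \end{cases}
\quad \text{with respect to } (G, M) = \left(\frac{d(\pi \times L)_2}{d(\pi \times K)_1}, M^K\right)$$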



Proof: 

Suppose $\mu \ll \eta$ and $\mu M = \eta$ for some Markov transition $M$ on $E$. Let $\Psi$ be
the Boltzmann-Gibbs transformation associated with the Radon-Nikodym
potential $G = \frac{d\mu}{d\eta}$. By construction, we have

$$\Psi(\eta) = \mu \quad \text{and} \quad \Phi(\eta) = \Psi(\eta) M = \mu M = \eta$$

and in the same way $\hat\Phi(\mu) = \Psi(\mu M) = \Psi(\eta) = \mu$. This shows that $\mu$ and $\eta$
satisfy the desired invariance property, and the proof of the first assertion
is completed. The second result is a direct consequence of the preceding
one. From the latter, it suffices to check that $(\pi \times L)_2 M^K = (\pi \times K)_1$. A
simple calculation shows that

$$(\pi \times L)_2 M^K(d(y,y')) = \int \pi(dz')\, L(z',dz)\, \delta_{z'}(dy)\, K(y,dy')
= \pi(dy)\, K(y,dy') = (\pi \times K)_1(d(y,y'))$$

This completes the proof of the proposition. ■
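
As a sanity check, Proposition 5.3.1 can be verified numerically on a finite state space, where measures are vectors and kernels are stochastic matrices. The sketch below is only an illustration under stated assumptions: the state space size, the randomly generated $\pi$, $K$, $L$, and all helper names (`apply_MK`, `boltzmann_gibbs`, and so on) are hypothetical and do not come from the text.

```python
import random

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def random_kernel(size, rng):
    # Row-stochastic matrix with strictly positive entries (illustrative choice).
    return [normalize([rng.random() + 0.1 for _ in range(size)]) for _ in range(size)]

def apply_MK(mu, K):
    # (mu M^K)(y', z') = sum_{(y, z)} mu(y, z) 1{z = y'} K(y', z')
    size = len(K)
    out = {(a, b): 0.0 for a in range(size) for b in range(size)}
    for (y, z), mass in mu.items():
        for z_next in range(size):
            out[(z, z_next)] += mass * K[z][z_next]
    return out

def boltzmann_gibbs(eta, G):
    # Psi(eta)(x) = G(x) eta(x) / eta(G)
    norm = sum(G[x] * eta[x] for x in eta)
    return {x: G[x] * eta[x] / norm for x in eta}

rng = random.Random(0)
size = 4
pi = normalize([rng.random() + 0.1 for _ in range(size)])
K = random_kernel(size, rng)
L = random_kernel(size, rng)

pairs = [(y, yp) for y in range(size) for yp in range(size)]
piK1 = {(y, yp): pi[y] * K[y][yp] for (y, yp) in pairs}   # (pi x K)_1
piL2 = {(y, yp): pi[yp] * L[yp][y] for (y, yp) in pairs}  # (pi x L)_2
G = {x: piL2[x] / piK1[x] for x in pairs}                 # Radon-Nikodym potential

# Key identity of the proof: (pi x L)_2 M^K = (pi x K)_1
mapped = apply_MK(piL2, K)
gap_identity = max(abs(mapped[x] - piK1[x]) for x in pairs)

# Phi(eta) = Psi(eta) M^K leaves (pi x K)_1 invariant
phi_image = apply_MK(boltzmann_gibbs(piK1, G), K)
gap_phi = max(abs(phi_image[x] - piK1[x]) for x in pairs)

# hat-Phi(mu) = Psi(mu M^K) leaves (pi x L)_2 invariant
phi_hat_image = boltzmann_gibbs(apply_MK(piL2, K), G)
gap_phi_hat = max(abs(phi_hat_image[x] - piL2[x]) for x in pairs)
```

All three gaps should vanish up to floating-point rounding, for any strictly positive choice of $\pi$, $K$, and $L$.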



Such arguments are powerful tools to construct Feynman-Kac models
having a prescribed invariant measure. Let us give a couple of examples to
illustrate this assertion. Let $\pi$ be a given distribution on $S$ and let $(K, L)$
be a pair of Markov kernels such that $L(y, \cdot) \ll \pi \ll K(y, \cdot)$.

From the preceding proposition the measures $(\pi \times K)_1$ and $(\pi \times L)_2$ are
respectively $\Phi$- and $\hat\Phi$-invariant with respect to the pair potential/kernel

$$(G, M) = \left(\frac{d(\pi \times L)_2}{d(\pi \times K)_1}, M^K\right)$$

One important target distribution arising in practice is the Boltzmann-Gibbs
measure associated with a given nonnegative energy function $V$ on
$S$. These distributions have the form

$$\pi(dy) = \frac{1}{\nu(e^{-V})}\, e^{-V(y)}\, \nu(dy) \qquad (5.3)$$

where $\nu \in \mathcal{M}_+(S)$ is a reference measure such that $\nu(e^{-V}) > 0$. For any
pair of Markov kernels $(K, L)$ such that $(\nu \times L)_2 \ll (\nu \times K)_1$, we have
$(\pi \times L)_2 \ll (\pi \times K)_1$. Using Proposition 5.3.1, we conclude that $(\pi \times K)_1$
is $\Phi$-invariant with respect to the pair

$$(G, M) = \left(\frac{d(\pi \times L)_2}{d(\pi \times K)_1}, M^K\right)$$

In this situation, we notice that

$$G(y,y') = e^{-(V(y') - V(y))}\, \frac{d(\nu \times L)_2}{d(\nu \times K)_1}(y,y')$$

Suppose the measure $\nu$ is reversible with respect to $K$. If we choose $K = L$,
the corresponding potential function takes the form

$$G(y,y') = e^{-(V(y') - V(y))}$$







5.4 Feynman-Kac and Metropolis-Hastings Models 

The pair potential/kernel $(G, M) = \left(\frac{d(\pi \times L)_2}{d(\pi \times K)_1}, M^K\right)$ also arises in the
construction of the Metropolis-Hastings Markov chain. This model is very
popular in the Markov chain Monte Carlo literature mainly because it provides
a kind of universal strategy to construct a homogeneous Markov chain with
target limiting distribution $\pi$. To underline the similarities and the
differences between these two models, we end this section around this theme.

The Feynman-Kac and Metropolis-Hastings models associated with the
pair $(G, M^K)$ correspond to two different ways to associate with the
Radon-Nikodym ratio $G$ a Markov kernel on the transition phase $E = S^2$:

1. One of the central objects in the construction of the Feynman-Kac
model is the Boltzmann-Gibbs transformation $\Psi$ associated with $G$.
We recall that for any $\eta \in \mathcal{P}(E)$ the measure $\Psi(\eta)$ is given by

$$\Psi(\eta)(d(y,y')) = \frac{1}{\eta(G)}\, G(y,y')\, \eta(d(y,y'))$$

There exist many different ways to connect the distribution $\Psi(\eta)$
with a Markov kernel. These different choices correspond to different
McKean interpretations of the Feynman-Kac model. We refer the
reader to Section 2.5.2 for a precise definition of a McKean model
(see also Section 2.5.3 for a collection of examples). We can choose
for instance the decomposition

$$\Psi(\eta) = \eta S_\eta$$

where $S_\eta$, $\eta \in \mathcal{P}(E)$, is the collection of Markov transitions on $E$
defined by

$$S_\eta((y,y'), \cdot) = \epsilon\, G(y,y')\, \delta_{(y,y')}(\cdot) + (1 - \epsilon\, G(y,y'))\, \Psi(\eta)(\cdot)$$

The parameter $\epsilon > 0$ is chosen so that $\epsilon G \le 1$. Under a suitably
defined McKean measure, the nonlinear recursive equation

$$\eta_{n+1} = \eta_n S_{\eta_n} M^K \qquad (5.4)$$

can be interpreted as the evolution of the laws of a nonhomogeneous
Markov chain on the transition phase $E = (S \times S)$ with elementary
transitions $K_{n+1,\eta_n} = S_{\eta_n} M^K$. Notice that the random evolution of
the chain is decomposed into two separate selection/mutation
mechanisms. The selection transition is intended to favor phase regions
with high Metropolis ratio $G$, while the mutation transition consists
in exploring the transition phase according to $M^K$.

2. The Metropolis-Hastings model associated with the pair $(G, M^K)$
is again based on two separate transitions. The first one consists
in exploring the phase space of all transitions $E = S^2$ according
to the Markov kernel $M^K$. The second transition $S$ is an
acceptance/rejection mechanism on $E$. It is defined in terms of $G$ by the
expression

$$S((y,y'), \cdot) = (1 \wedge G(y,y'))\, \delta_{(y,y')}(\cdot) + (1 - (1 \wedge G(y,y')))\, \delta_{(y,y)}(\cdot)$$
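
The decomposition $\Psi(\eta) = \eta S_\eta$ and the nonlinear flow (5.4) can be checked on a finite state space. The following sketch is a hedged illustration, not the author's construction: the state space, the random $\pi$, $K$, $L$, the choice $\epsilon = 1/\max G$, and every function name are assumptions made for the example. It first verifies $\eta S_\eta = \Psi(\eta)$ for $\eta = \delta_y \times K$, then iterates $\eta_{n+1} = \eta_n S_{\eta_n} M^K$ and measures the distance to $(\pi \times K)_1$.

```python
import random

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def random_kernel(size, rng):
    # Strictly positive row-stochastic matrix (illustrative choice).
    return [normalize([rng.random() + 0.1 for _ in range(size)]) for _ in range(size)]

def psi(eta, G):
    # Boltzmann-Gibbs transformation: Psi(eta)(x) = G(x) eta(x) / eta(G)
    norm = sum(G[x] * eta[x] for x in eta)
    return {x: G[x] * eta[x] / norm for x in eta}

def apply_selection(eta, G, eps):
    # (eta S_eta)(x') = eps G(x') eta(x') + [sum_x eta(x)(1 - eps G(x))] Psi(eta)(x')
    target = psi(eta, G)
    rejected_mass = sum(eta[x] * (1.0 - eps * G[x]) for x in eta)
    return {x: eps * G[x] * eta[x] + rejected_mass * target[x] for x in eta}

def apply_MK(mu, K):
    # (mu M^K)(y', z') = sum_{(y, z)} mu(y, z) 1{z = y'} K(y', z')
    size = len(K)
    out = {(a, b): 0.0 for a in range(size) for b in range(size)}
    for (y, z), mass in mu.items():
        for z_next in range(size):
            out[(z, z_next)] += mass * K[z][z_next]
    return out

rng = random.Random(3)
size = 4
pi = normalize([rng.random() + 0.1 for _ in range(size)])
K = random_kernel(size, rng)
L = random_kernel(size, rng)
pairs = [(y, yp) for y in range(size) for yp in range(size)]
piK1 = {(y, yp): pi[y] * K[y][yp] for (y, yp) in pairs}
G = {(y, yp): pi[yp] * L[yp][y] / (pi[y] * K[y][yp]) for (y, yp) in pairs}
eps = 1.0 / max(G.values())          # guarantees eps * G <= 1

# eta_0 = delta_y x K with y = 0
eta = {(y, yp): (K[0][yp] if y == 0 else 0.0) for (y, yp) in pairs}

selected = apply_selection(eta, G, eps)
gibbs = psi(eta, G)
gap_selection = max(abs(selected[x] - gibbs[x]) for x in pairs)

# Iterate the nonlinear recursion (5.4)
for _ in range(200):
    eta = apply_MK(apply_selection(eta, G, eps), K)
gap_flow = max(abs(eta[x] - piK1[x]) for x in pairs)
```

In this finite and strictly positive setting the flow settles on $(\pi \times K)_1$; the number of iterations (200) is an arbitrary, generous choice.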

It is instructive to examine the “advantages and drawbacks” of these two 
models. 

One advantage of the Metropolis-Hastings model is that its semigroup 
structure is homogeneous and linear so that the algorithm can be sampled 
perfectly. One drawback is that the acceptance/rejection transition S de- 
scribed above tends to slow down the convergence to equilibrium of the 
chain. For instance, for the Boltzmann-Gibbs limiting distribution (5.3)
and in the reversible situation, the transition $S$ has the form

$$S((y,y'), \cdot) = \left(1 \wedge e^{-(V(y') - V(y))}\right) \delta_{(y,y')}(\cdot) + \left(1 - \left(1 \wedge e^{-(V(y') - V(y))}\right)\right) \delta_{(y,y)}(\cdot)$$

In this situation, the rejection probability is close to one in transition phase
regions where the difference $(V(y') - V(y))$ is high. On the other hand, we
recall that each pair $(y,y')$ has to be interpreted as an elementary
transition $(y \to y')$ with distribution $K(y,dy')$. In practice, $K(y,dy')$ is often
a distribution on some local neighborhood of $y$. These two observations
indicate that the algorithm may be trapped for a long period of time in the
neighborhood of some local minimum of the energy function $V$.

At first sight, one drawback of the Feynman-Kac model is that its semi- 
group structure is nonlinear and nonhomogeneous so that it cannot be 
sampled directly. Indeed, to sample random transitions of this chain, we 
need to compute the solution to the nonlinear equation (5.4). Nevertheless, 
one advantage of the nonlinear selection transition is that the resulting 
Markov model is not slowed down by a rejection mechanism. Furthermore, 
using any particle interpretation of the Feynman-Kac model (5.4), this 
nonlinearity is turned into an interaction mechanism between the particle 
transitions. In this way, the drawbacks discussed above are turned into a 
natural and advantageous way to define a sequence of interacting Metropo- 
lis type models. In contrast to the classical Metropolis model, a rejected 
transition is here instantly replaced by a better-fitted one randomly cho- 
sen in the current particle transition configuration. From this discussion, 
it is intuitively clear that the Feynman-Kac model is not slowed down by 
a rejection stage and it should have better asymptotic properties. More 
interestingly, the law of the states of this nonhomogeneous Markov model 
has a nice explicit description in terms of the Feynman-Kac formulae 

$$\eta_n(f) = \gamma_n(f)/\gamma_n(1) \quad \text{with} \quad \gamma_n(f) = \mathbb{E}^K_{\eta_0}\Big(f(X_n) \prod_{p=0}^{n-1} G(X_p)\Big)$$







where $\mathbb{E}^K_{\eta_0}(\cdot)$ is the expectation with respect to the law of a homogeneous
Markov chain $X_n$ with transitions $M^K$ and initial distribution $\eta_0$. In
addition, $\eta_n$ is the $n$th time marginal of the Feynman-Kac path measure

$$\mathbb{Q}^K_{\eta_0,n}(d(x_0,\ldots,x_n)) = \frac{1}{\gamma_n(1)} \Big\{\prod_{p=0}^{n-1} G(x_p)\Big\}\, \mathbb{P}^K_{\eta_0,n}(d(x_0,\ldots,x_n))$$

where $\mathbb{P}^K_{\eta_0,n}$ is the distribution of the path $(X_0, \ldots, X_n)$

$$\mathbb{P}^K_{\eta_0,n}(d(x_0,\ldots,x_n)) = \eta_0(dx_0)\, M^K(x_0,dx_1) \ldots M^K(x_{n-1},dx_n)$$

We recall that the measures $\eta_n$ and $\mathbb{Q}^K_{\eta_0,n}$ are the limiting measures of the
"marginal" and "genealogical" particle approximation models associated
with the flow (5.4) (see Chapter 3, Section 11.2, Section 11.4, and the end
of Section 5.5.1). On the other hand if we start with the initial distribution
$\eta_0 = (\delta_y \times K)_1 \in \mathcal{P}(E)$ for some $y \in S$, we prove that

$$\mathbb{Q}^K_{(\delta_y \times K)_1,n} = \mathbb{P}^L_{(\pi \times K)_2}\big(((Y_{n+1},Y_n),\ldots,(Y_1,Y_0)) \in \cdot \mid Y_{n+1} = y\big) \qquad (5.5)$$

where $\mathbb{P}^L_{(\pi \times K)_2}$ is the law of a homogeneous Markov chain $X_n = (Y_n, Y_{n+1})$
with transitions $M^L$ and initial distribution $(\pi \times K)_2$. We also notice that

$$\mathbb{P}^L_{(\pi \times K)_2}\big((Y_{n+1},\ldots,Y_1) \in \cdot \mid Y_{n+1} = y\big)
= \mathbb{P}^L_{(\pi \times L)_1}\big((Y_n,\ldots,Y_0) \in \cdot \mid Y_n = y\big)
= \mathbb{P}^L_{\pi}\big((Y_n,\ldots,Y_0) \in \cdot \mid Y_n = y\big)$$

where $\mathbb{P}^L_\pi$ stands for the law of an $S$-valued Markov chain $Y_n$ with initial
distribution $\pi$ and transitions $L$. Finally, we observe that for sufficiently
regular transitions $L$ we expect that in some sense

$$\eta_n = \mathbb{P}^L_{(\pi \times K)_2}\big((Y_1,Y_0) \in \cdot \mid Y_{n+1} = y\big) \xrightarrow[n \to \infty]{} (\pi \times K)_1$$

These results show that the particle model is not only designed to sample
according to $\pi$, but its genealogical tree also allows us to produce
approximate samples according to the law of a restricted Markov chain with respect
to its terminal values. The precise analysis of the preceding assertions is
out of the scope of this section. Because of its importance in practice, we
devote a separate section, Section 5.5, to the full analysis of these models.

5.5 Feynman-Kac-Metropolis Models 

5. 5. 1 Introduction 

In this section, we use the same notation and the same terminology as in
Section 5.3. We recall that $K(y,dy')$ and $L(y,dy')$ are two given Markov
kernels on a measurable space $(S, \mathcal{S})$ and $M^K$ is the Markov kernel on the
transition phase $E = S^2$ defined by

$$M^K((y,z), d(y',z')) = \delta_z(dy')\, K(y',dz')$$

We further assume for simplicity that $\pi \in \mathcal{P}(S)$ and the pair kernels $(K, L)$
are chosen so that

$$(\pi \times L)_2(d(y,y')) = \pi(dy')\, L(y',dy) \ \ll\ (\pi \times K)_1(d(y,y')) = \pi(dy)\, K(y,dy')$$

and the Radon-Nikodym potential

$$(y,y') \in E = (S \times S) \ \mapsto\ G(y,y') = \frac{d(\pi \times L)_2}{d(\pi \times K)_1}(y,y')$$

is a strictly positive and bounded function on the product space $E = S^2$.

In reference to the discussion given in Section 5.4, we adopt the following 
terminology. 

Definition 5.5.1 The Feynman-Kac model associated with the pair
potential/kernel

$$(G, M) = \left(\frac{d(\pi \times L)_2}{d(\pi \times K)_1}, M^K\right)$$

is called the Feynman-Kac-Metropolis model (associated with $(G,M)$).

To compare the ways the Radon-Nikodym potential enters into the Feynman- 
Kac-Metropolis particle approximation models or the Metropolis-Hastings 
Markov model, we provide hereafter a brief description of these two algo- 
rithms. 

The Metropolis-Hastings model is a homogeneous Markov chain

$$Z_n = (Y_n, Y'_n) \in E = S^2$$

with a two-step selection/mutation transition $S M^K$. By construction, the
first component $Y_n$ is an $S$-valued and homogeneous Markov chain. Its
elementary transitions are given by the familiar expression

$$\mathbb{P}(Y_{n+1} \in dy' \mid Y_n = y) = (1 \wedge G(y,y'))\, K(y,dy') + \Big(1 - \int_S (1 \wedge G(y,z))\, K(y,dz)\Big)\, \delta_y(dy') \qquad (5.6)$$

To prove this assertion, we first note that

$$\mathbb{E}(f(Y_{n+1}) \mid (Y_n, Y'_n)) = S(1 \otimes f)(Y_n, Y'_n)
= (1 \wedge G(Y_n, Y'_n))\, f(Y'_n) + (1 - (1 \wedge G(Y_n, Y'_n)))\, f(Y_n)$$

Since we have $\mathbb{E}(f(Y'_n) \mid Y_n) = K(f)(Y_n)$ for any $n \ge 1$, we conclude that

$$\mathbb{E}(f(Y_{n+1}) \mid Y_n = y) = \int_S (1 \wedge G(y,y'))\, f(y')\, K(y,dy') + \Big(1 - \int_S (1 \wedge G(y,z))\, K(y,dz)\Big) f(y)$$

This ends the proof of the desired result. It is also well-known that if we
can choose $K = L$, then $\pi$ is an invariant measure of this chain. By
construction, the random evolution of $Y_n$ is decomposed into two separate
mechanisms

$$Y_n \xrightarrow{\ \text{exploration}\ } Y'_n \xrightarrow{\ \text{selection}\ } Y_{n+1}$$

During the exploration stage, the particle $Y_n$ makes an elementary move
according to the Markov transition $K$. In other words, the single particle
$Y_n$ randomly chooses a new location $Y'_n$ with distribution $K(Y_n, \cdot)$. During
the selection stage we accept the transition $(Y_n \to Y'_n)$ with a probability
$(1 \wedge G(Y_n, Y'_n))$ and we set $Y_{n+1} = Y'_n$. Otherwise we reject the transition
and we stay in the same location; that is, we set $Y_{n+1} = Y_n$.
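
The transition (5.6) is easy to write down as a matrix on a finite state space. The sketch below is a minimal illustration under assumptions that are not in the text: $S$ is a ring of six sites, $K = L$ is a lazy symmetric nearest-neighbor walk (reversible with respect to the counting measure), the energy values in `V` are arbitrary, and the function names are hypothetical. It builds the kernel of (5.6) with $G(y,y') = e^{-(V(y') - V(y))}$ and checks that the Boltzmann-Gibbs measure (5.3) is invariant.

```python
import math

def mh_kernel(K, G):
    # Matrix form of (5.6): accept y -> y' with probability min(1, G(y, y')),
    # and put the total rejection mass on the current state y.
    size = len(K)
    P = [[min(1.0, G[y][yp]) * K[y][yp] for yp in range(size)] for y in range(size)]
    for y in range(size):
        P[y][y] += 1.0 - sum(P[y])
    return P

V = [0.0, 1.5, 0.3, 2.0, 0.7, 1.1]        # arbitrary energy function
size = len(V)
Z = sum(math.exp(-v) for v in V)
pi = [math.exp(-v) / Z for v in V]        # Boltzmann-Gibbs target (5.3)

# Lazy symmetric walk on a ring: stay, step left, or step right with probability 1/3.
K = [[0.0] * size for _ in range(size)]
for y in range(size):
    for step in (-1, 0, 1):
        K[y][(y + step) % size] += 1.0 / 3.0

# Reversible case K = L: G(y, y') = exp(-(V(y') - V(y)))
G = [[math.exp(-(V[yp] - V[y])) for yp in range(size)] for y in range(size)]

P = mh_kernel(K, G)
pi_next = [sum(pi[y] * P[y][yp] for y in range(size)) for yp in range(size)]
invariance_gap = max(abs(a - b) for a, b in zip(pi, pi_next))
row_sums = [sum(row) for row in P]
```

The large diagonal entries of `P` at low-energy sites make the trapping effect discussed in Section 5.4 visible: uphill proposals are mostly rejected.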

The Feynman-Kac-Metropolis model is a nonhomogeneous Markov
chain on the transition phase

$$Z_n = (Y_n, Y'_n) \in E = S^2$$

with a two-step selection/mutation transition $K_{n,\eta_{n-1}} = S_{\eta_{n-1}} M^K$. The
sequence of distributions $\eta_n \in \mathcal{P}(E)$ is the solution of the nonlinear
recursive equation

$$\eta_n = \eta_{n-1} S_{\eta_{n-1}} M^K \quad (= \Psi(\eta_{n-1}) M^K) \qquad (5.7)$$

where

• $S_\eta$, $\eta \in \mathcal{P}(E)$, is the collection of Markov transitions

$$S_\eta((y,y'), \cdot) = \epsilon\, G(y,y')\, \delta_{(y,y')}(\cdot) + (1 - \epsilon\, G(y,y'))\, \Psi(\eta)(\cdot)$$

• $\Psi : \mathcal{P}(E) \to \mathcal{P}(E)$ is the Boltzmann-Gibbs transformation associated
with the Metropolis potential function $G$, and it is defined by

$$\Psi(\eta)(d(y,y')) = \frac{1}{\eta(G)}\, G(y,y')\, \eta(d(y,y'))$$

We recall that the distribution flow $\eta_n$ is alternatively defined in terms of
the Feynman-Kac formulae

$$\eta_n(f) = \gamma_n(f)/\gamma_n(1) \quad \text{with} \quad \gamma_n(f) = \mathbb{E}^K_{\eta_0}\Big(f(X_n) \prod_{p=0}^{n-1} G(X_p)\Big)$$





where $\mathbb{E}^K_{\eta_0}(\cdot)$ is the expectation with respect to the law of a homogeneous
Markov chain $X_n$ with transitions $M^K$ and initial distribution $\eta_0$.
Under a suitably chosen McKean measure, the distribution $\eta_n$ represents
the law of the random state of the chain $Z_n = (Y_n, Y'_n)$ at each time $n$
(see Section 2.5.2 for a precise construction of these McKean measures).
Arguing as before, we see that the random evolution is again decomposed
into two separate mechanisms

$$Y_n \xrightarrow{\ \text{exploration}\ } Y'_n \xrightarrow{\ \text{selection}\ } Y_{n+1}$$

The exploration stage coincides with that of the Metropolis model but the
selection stage is different. Here we accept the transition $(Y_n \to Y'_n)$ with
a probability $\epsilon G(Y_n, Y'_n)$, and we set $Y_{n+1} = Y'_n$. Otherwise, we select
randomly a new one $(\tilde Y_n, \tilde Y'_n)$ according to $\Psi(\eta_{n-1})$, and we set $Y_{n+1} = \tilde Y'_n$. As
we mentioned above, this nonhomogeneous Markov model cannot be
sampled perfectly mainly because the distributions $\eta_{n-1}$ and a fortiori $\Psi(\eta_{n-1})$
are generally unknown.



The interacting Metropolis model is the $N$-particle approximation
model

$$\xi_n = (Y^i_n, Y'^i_n)_{1 \le i \le N} \in E^N = (S \times S)^N$$

associated with the McKean interpretation (5.7) of the Feynman-Kac
distribution flow. We refer the reader to Section 2.5.2 for a precise
description of these particle interpretations. For the convenience of the reader
and to better connect this particle simulation method with the preceding
Metropolis-Hastings model, we provide hereafter a brief presentation of this
model. Suppose the initial distribution is given by

$$\eta_0 = \delta_y \times K$$

with an arbitrary point $y \in S$. In this situation, the initial system is given
by $\xi_0 = (y, Y'^i_0)_{1 \le i \le N}$, where $(Y'^i_0)_{1 \le i \le N}$ are independent and identically
distributed random variables with common law $K(y, \cdot)$. The $N$-particle
model associated with the McKean kernels $S_\eta M^K$ is a mutation/selection
algorithm

$$(Y^i_n)_{1 \le i \le N} \xrightarrow{\ \text{mutation}\ } (Y'^i_n)_{1 \le i \le N} \xrightarrow{\ \text{selection}\ } (Y^i_{n+1})_{1 \le i \le N}$$

During the mutation stage, each particle $Y^i_n$ evolves randomly and
independently according to the Markov kernel $K$ to some new locations $Y'^i_n$. These
new locations $Y'^i_n$ are accepted or rejected according to a mechanism that
depends on the set of sampled transitions $(Y^j_n, Y'^j_n)$, $1 \le j \le N$. With a
probability $\epsilon G(Y^i_n, Y'^i_n)$, we accept the $i$th state $Y'^i_n$ and we set $Y^i_{n+1} = Y'^i_n$.
Otherwise, we select randomly a state $\tilde Y'^i_n$ with distribution

$$\sum_{j=1}^N \frac{G(Y^j_n, Y'^j_n)}{\sum_{k=1}^N G(Y^k_n, Y'^k_n)}\, \delta_{Y'^j_n}$$

and we set $Y^i_{n+1} = \tilde Y'^i_n$. Loosely speaking the selection transition is intended
to improve the quality of the configuration by allocating more reproductive
opportunities to pair particles $(Y^i_n, Y'^i_n)$ with higher Metropolis ratio. From
the preceding description, this $N$-particle model can clearly be interpreted
as a sequence of $N$ interacting Metropolis algorithms.

The choice of the McKean selection transition is not unique (see Sec- 
tionmckean). In practice, it is desirable to choose a McKean interpretation 
with the highest acceptance probability. In this connection, if one chooses
the selection model (2.27) then the acceptance probability $\epsilon G(Y^i_n, Y'^i_n)$ is
replaced by $G(Y^i_n, Y'^i_n) / \vee_j G(Y^j_n, Y'^j_n)$.
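
The mutation/selection step of the interacting Metropolis model can be sketched in a few lines. This is an illustrative implementation under assumptions of ours, not the book's code: the state space is a ring of six sites with an arbitrary energy `V`, $K = L$ is a lazy symmetric walk so that $G(y,y') = e^{-(V(y') - V(y))}$, $\epsilon = e^{-(\max V - \min V)}$ ensures $\epsilon G \le 1$, and the function names are hypothetical. The returned `parents` list records which particle each new state descends from.

```python
import math
import random

def sample_K(y, size, rng):
    # Exploration kernel K: stay or move to a ring neighbor, each with probability 1/3.
    return (y + rng.choice((-1, 0, 1))) % size

def interacting_metropolis_step(ys, V, eps, rng):
    """One mutation/selection step; returns the new states and their parent indices."""
    size = len(V)
    n_particles = len(ys)
    # Mutation: Y'^i_n ~ K(Y^i_n, .)
    proposals = [sample_K(y, size, rng) for y in ys]
    # Metropolis ratios G(Y^i_n, Y'^i_n) in the reversible Boltzmann-Gibbs case
    weights = [math.exp(-(V[yp] - V[y])) for y, yp in zip(ys, proposals)]
    # Replacement candidates drawn with probability proportional to G
    resampled = rng.choices(range(n_particles), weights=weights, k=n_particles)
    new_ys, parents = [], []
    for i in range(n_particles):
        # Selection: keep the own proposal with probability eps * G, else take a replacement
        j = i if rng.random() < eps * weights[i] else resampled[i]
        new_ys.append(proposals[j])
        parents.append(j)
    return new_ys, parents

V = [0.0, 1.5, 0.3, 2.0, 0.7, 1.1]
eps = math.exp(-(max(V) - min(V)))     # guarantees eps * G <= 1
rng = random.Random(1)
n_particles = 2000
ys = [0] * n_particles                 # all particles start at y = 0
for _ in range(60):
    ys, parents = interacting_metropolis_step(ys, V, eps, rng)

Z = sum(math.exp(-v) for v in V)
pi = [math.exp(-v) / Z for v in V]
freq = [ys.count(s) / n_particles for s in range(len(V))]
tv_distance = 0.5 * sum(abs(f - p) for f, p in zip(freq, pi))
```

With a few thousand particles the occupation frequencies `freq` should be close to $\pi$; the lazy step in `sample_K` avoids the parity problem a pure nearest-neighbor walk would have on an even ring.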

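We finally note that each equivalent formulation of a given Feynman-Kac
measure induces a different interacting Metropolis type model. For
instance, we have for any $x_0 \in E$ and any $f_n \in \mathcal{B}_b(E^{n+1})$

$$\mathbb{E}^K_{x_0}\Big(f_n(X_0,\ldots,X_n) \prod_{p=1}^{n} G(X_p)\Big) = \hat{\mathbb{E}}^K_{x_0}\Big(f_n(X_0,\ldots,X_n) \prod_{p=0}^{n-1} \hat G(X_p)\Big)$$

where $\hat G = M^K(G)$ and $\hat{\mathbb{E}}^K_{x_0}(\cdot)$ is the expectation with respect to the law
of a homogeneous Markov chain with transitions

$$\hat M^K(x, dx') = \frac{M^K(x,dx')\, G(x')}{M^K(G)(x)}$$

For instance, for the Boltzmann-Gibbs distribution (5.3), and in the
reversible situation, the pair potential/kernel $(\hat G, \hat M^K)$ takes the form

$$\hat G(y,y') = e^{V(y')}\, K(e^{-V})(y')$$

and $\hat M^K((y,y'), d(z,z')) = \delta_{y'}(dz)\, \hat K(z,dz')$ with

$$\hat K(z,dz') = \frac{K(z,dz')\, e^{-V(z')}}{K(e^{-V})(z)}$$

The corresponding $N$-particle model is again a two-step Markov chain

$$(Y^i_n)_{1 \le i \le N} \xrightarrow{\ \text{mutation}\ } (Y'^i_n)_{1 \le i \le N} \xrightarrow{\ \text{selection}\ } (Y^i_{n+1})_{1 \le i \le N}$$

with a selection/mutation procedure defined as above by replacing the pair
$(G, M^K)$ by the pair $(\hat G, \hat M^K)$.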



5.5.2 The Genealogical Metropolis Particle Model 

Another important feature of the evolutionary particle scheme described in 
the introductory section concerns its birth and death interpretation: The 
individual Y^^_i selected by the tth individual YJj can be seen as the parent 







of y^. Recalling that has itself been sampled according to .), 

we can interpret Y^_^ as the ancestor of Y* at level (n~ 1). Running 

this construction back in time, we can trace back the complete ancestral 
line of each current individual $Y^i_{n,n} = Y^i_n$

$$Y^i_{0,n} \longleftarrow Y^i_{1,n} \longleftarrow \cdots \longleftarrow Y^i_{n-1,n} \longleftarrow Y^i_{n,n}$$
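
Tracing ancestral lines only requires the parent index chosen by each particle at each selection step. The helper below is a hypothetical sketch (the names `history`, `parents`, and `ancestral_lines` are ours): `history[p][i]` is the state of particle $i$ at time $p$, and `parents[p][i]` is the index, at time $p$, of the particle from which particle $i$ at time $p+1$ descends.

```python
def ancestral_lines(history, parents):
    """Return, for each current particle i, its line (Y^i_{0,n}, ..., Y^i_{n,n})."""
    n = len(history) - 1
    lines = []
    for i in range(len(history[n])):
        line = [history[n][i]]
        j = i
        # Walk back in time, following the recorded parent indices.
        for p in range(n - 1, -1, -1):
            j = parents[p][j]
            line.append(history[p][j])
        lines.append(line[::-1])
    return lines

# Tiny hand-made example with two particles and two selection steps.
history = [["a0", "b0"], ["a1", "b1"], ["a2", "b2"]]
parents = [[1, 1],   # both particles at time 1 descend from particle 1 at time 0
           [0, 0]]   # both particles at time 2 descend from particle 0 at time 1
lines = ancestral_lines(history, parents)
```

In this example both lines share the ancestors `b0` and `a1`, which illustrates how selection makes ancestral lines coalesce when traced backward.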



The study of the genealogical structure of the interacting Metropolis model
(starting at $y \in S$) is not of pure mathematical interest. This object is
in fact a powerful particle simulation method for drawing path samples
of restricted Markov models with respect to their terminal value. More
precisely, we have

$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^N \delta_{(Y^i_{0,n}, \ldots, Y^i_{n,n})}(d(y_n, \ldots, y_0))
= \mathbb{P}^L_\pi\big((Y_n, Y_{n-1}, \ldots, Y_0) \in d(y_n, \ldots, y_0) \mid Y_n = y\big) \qquad (5.8)$$

where $\mathbb{P}^L_\pi$ represents the distribution of a time-homogeneous Markov chain
$Y_n$ with initial distribution $\pi$ and elementary transition $L$. For a precise
meaning of this convergence result, we refer the reader to Chapter 7. The
preceding convergence result indicates that the Feynman-Kac-Metropolis
model is related to a time reversal of a Markov chain. To prove that the
limiting distribution is precisely the one given above, we need to analyze
the Feynman-Kac model on path space. More precisely, let $\xi_n = (\xi^i_n)_{1 \le i \le N}$
be the $N$-particle model associated with the McKean interpretation of the
equation (5.7). Using the same line of reasoning as above, we can interpret
this particle model as a birth and death process and we can trace back the
complete ancestral line

$$\xi^i_{0,n} \longleftarrow \xi^i_{1,n} \longleftarrow \cdots \longleftarrow \xi^i_{n-1,n} \longleftarrow \xi^i_{n,n}$$

of each pair individual $\xi^i_n = (Y^i_n, Y'^i_n)$. From the results presented in
Section 3.4, we know that this path-particle model can also be regarded as
the $N$-particle model associated with the Feynman-Kac distribution flow
on path space. More precisely, the occupation measures of the $N$-particle
genealogical model

$$\frac{1}{N} \sum_{i=1}^N \delta_{(\xi^i_{0,n}, \xi^i_{1,n}, \ldots, \xi^i_{n,n})}$$

converge as the size of the systems $N$ tends to $\infty$ to the path measures
$\mathbb{Q}^K_{\eta_0,n} \in \mathcal{P}(E^{n+1})$ defined by

$$\mathbb{Q}^K_{\eta_0,n}(d(x_0,\ldots,x_n)) = \frac{1}{\gamma_n(1)} \Big\{\prod_{p=0}^{n-1} G(x_p)\Big\}\, \mathbb{P}^K_{\eta_0,n}(d(x_0,\ldots,x_n))$$







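where $\mathbb{P}^K_{\eta_0,n}$ is the distribution of the path from zero to time $n$ of an
$E$-valued Markov chain with initial distribution $\eta_0$ and Markov transition
$M^K$. That is, we have that

$$\mathbb{P}^K_{\eta_0,n}(d(x_0,\ldots,x_n)) = \eta_0(dx_0)\, M^K(x_0,dx_1) \ldots M^K(x_{n-1},dx_n)$$

If we take a test function $f_n \in \mathcal{B}_b((S \times S)^{n+1})$ of the form

$$f_n((y_0,y'_0), \ldots, (y_n,y'_n)) = \varphi_n(y_0, \ldots, y_n)$$

for some $\varphi_n \in \mathcal{B}_b(S^{n+1})$ then we find that

$$\frac{1}{N} \sum_{i=1}^N \varphi_n(Y^i_{0,n}, Y^i_{1,n}, \ldots, Y^i_{n,n}) \xrightarrow[N \to \infty]{} \mathbb{Q}^K_{\eta_0,n}(f_n) \qquad (5.9)$$

with

$$\mathbb{Q}^K_{\eta_0,n}(f_n) = \frac{\mathbb{E}^K_y\Big(\varphi_n(Y_0,Y_1,\ldots,Y_n) \prod_{p=0}^{n-1} G(Y_p,Y_{p+1})\Big)}{\mathbb{E}^K_y\Big(\prod_{p=0}^{n-1} G(Y_p,Y_{p+1})\Big)}$$

In the formula displayed above, $\mathbb{E}^K_y(\cdot)$ represents the expectation with
respect to the law of a time-homogeneous Markov chain $Y_n$ starting at
$Y_0 = y$ with elementary transition $K$.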

To prove that the Feynman-Kac path measure in the previous display 
coincides with the conditional distribution (5.8), we need to analyze more 
deeply the structure of these path measures. The complete proof of this 
result will be provided in the next section. 

Using the same line of argument, the conditional distribution (5.8) can
be approximated using the ancestral lines $(\hat Y^i_{p,n})_{p \le n}$ of the particle model
associated to the pair $(\hat G, \hat M^K)$ defined in the end of Section 5.5.1, and
starting at $(Y^i_0, Y'^i_0) = (y_0, y_1)$. That is, we have



1 ^ 
t=l 



= p^:((yn,yn-i,...,yo)€.|y„ = yi) 



5.5.3 Path Space Models and Restricted Markov Chains 

In this section, we connect the Feynman-Kac path measures associated
with the pair potential/kernel

$$(G, M) = \left(\frac{d(\pi \times L)_2}{d(\pi \times K)_1}, M^K\right)$$

with the distributions of a restricted Markov chain with respect to its
terminal values. Without further mention, we suppose in this section that the
triplet $(\pi, K, L)$ is chosen such that

• The Radon-Nikodym potential $G$ is a strictly positive and bounded
function.

• The measures $\pi$ and $\pi L$ are mutually absolutely continuous, and for
any $n \in \mathbb{N}$ the Radon-Nikodym derivative $\frac{d\pi L^n}{d\pi}$ is a bounded and
strictly positive function on $S$.

Our first task is to describe more rigorously the Feynman-Kac path
measures on a canonical space. Let $\mu \in \mathcal{P}(S^2)$ and let $K$ be a given Markov
kernel on $S$. We denote by $\mathbb{P}^K_\mu$ the law of the canonical Markov chain

$$X_n = (Y_n, Y_{n+1}) \in E = (S \times S) \qquad (5.10)$$

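with initial distribution $\mu$ and transition $M^K$. More rigorously, we should
have set $X_n = (Y_n, Y'_n)$ instead of (5.10), but by definition of the Markov
transition $M^K$ we have $Y_{n+1} = Y'_n$ and therefore $X_n = (Y_n, Y_{n+1})$ as soon
as $n \ge 1$. To simplify the presentation, it seems more appropriate to use
the notation (5.10). In this obvious abusive notation, we have for instance
that

$$\mathbb{P}^K_\mu((Y_0,\ldots,Y_n) \in d(y_0,\ldots,y_n)) = \mu(d(y_0,y_1))\, K(y_1,dy_2) \ldots K(y_{n-1},dy_n)$$

As usual we denote by $\mathbb{E}^K_\mu(\cdot)$ the expectation with respect to the law
$\mathbb{P}^K_\mu$. When $\mu = \delta_{(x,y)}$ is concentrated at a single point, we simplify notation
and we write $(\mathbb{P}^K_{(x,y)}, \mathbb{E}^K_{(x,y)})$ instead of $(\mathbb{P}^K_{\delta_{(x,y)}}, \mathbb{E}^K_{\delta_{(x,y)}})$. We finally recall
that the Feynman-Kac path measure $\mathbb{Q}^K_{\eta_0,n} \in \mathcal{P}(E^{n+1})$ associated with the
pair $(G, M^K)$ and its terminal time marginal distributions $\eta_n \in \mathcal{P}(E)$ are
defined by

$$\mathbb{Q}^K_{\eta_0,n}(d(x_0,\ldots,x_n)) = \frac{1}{\gamma_n(1)} \Big\{\prod_{p=0}^{n-1} G(x_p)\Big\}\, \mathbb{P}^K_{\eta_0,n}(d(x_0,\ldots,x_n))$$

and

$$\eta_n(f) = \gamma_n(f)/\gamma_n(1) \quad \text{with} \quad \gamma_n(f) = \mathbb{E}^K_{\eta_0}\Big(f(X_n) \prod_{p=0}^{n-1} G(X_p)\Big)$$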





By the definition of $G$, we readily obtain the time reversal formula

$$\pi(dy_0)\, K(y_0,dy_1) \ldots K(y_n,dy_{n+1}) \prod_{p=0}^{n} G(y_p,y_{p+1})
= \pi(dy_{n+1})\, L(y_{n+1},dy_n) \ldots L(y_1,dy_0)$$

In other words we have

$$\frac{d\,\mathbb{P}^L_{\pi \times L}((Y_{n+1},Y_n,\ldots,Y_0) \in \cdot)}{d\,\mathbb{P}^K_{\pi \times K}((Y_0,Y_1,\ldots,Y_{n+1}) \in \cdot)}(y_0,\ldots,y_{n+1}) = \prod_{p=0}^{n} G(y_p,y_{p+1})$$

and we find the following pivotal lemma.

Lemma 5.5.1 (time reversal formula) For any $n \ge 0$ and any $\varphi_n \in \mathcal{B}_b(S^{n+2})$
we have

$$\mathbb{E}^L_{\pi \times L}\big(\varphi_n(Y_{n+1},Y_n,\ldots,Y_0)\big)
= \mathbb{E}^K_{\pi \times K}\Big(\varphi_n(Y_0,Y_1,\ldots,Y_{n+1}) \prod_{p=0}^{n} G(Y_p,Y_{p+1})\Big)$$

In the next theorem, we provide another couple of time reversal formulae.
They allow us to interpret the multiplicative structure of the Feynman-Kac
model as a combination of a change of probability transitions with a time
reversal of a Markov chain.
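
On a finite state space, Lemma 5.5.1 can be checked by brute-force enumeration of all paths. The sketch below is only an illustration: the three-point state space, the random strictly positive $\pi$, $K$, $L$, the test function `phi`, and the helper names are assumptions introduced for the example.

```python
import itertools
import random

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def random_kernel(size, rng):
    # Strictly positive row-stochastic matrix (illustrative choice).
    return [normalize([rng.random() + 0.1 for _ in range(size)]) for _ in range(size)]

def time_reversal_sides(pi, K, L, n, phi):
    """Return (E^L[phi(Y_{n+1},...,Y_0)], E^K[phi(Y_0,...,Y_{n+1}) prod_{p=0}^n G])."""
    size = len(pi)

    def G(y, yp):
        # G(y, y') = pi(y') L(y', y) / (pi(y) K(y, y'))
        return pi[yp] * L[yp][y] / (pi[y] * K[y][yp])

    lhs = 0.0
    rhs = 0.0
    for path in itertools.product(range(size), repeat=n + 2):   # (y_0, ..., y_{n+1})
        prob_K = pi[path[0]]
        prob_L = pi[path[0]]
        weight = 1.0
        for p in range(n + 1):
            prob_K *= K[path[p]][path[p + 1]]
            prob_L *= L[path[p]][path[p + 1]]
            weight *= G(path[p], path[p + 1])
        lhs += prob_L * phi(path[::-1])      # L-chain, read backward in time
        rhs += prob_K * weight * phi(path)   # K-chain, reweighted by the potentials
    return lhs, rhs

rng = random.Random(5)
size = 3
pi = normalize([rng.random() + 0.1 for _ in range(size)])
K = random_kernel(size, rng)
L = random_kernel(size, rng)

def phi(path):
    # Arbitrary bounded test function of the whole path.
    return sum((k + 1) * (v + 1) for k, v in enumerate(path))

lhs, rhs = time_reversal_sides(pi, K, L, n=3, phi=phi)
reversal_gap = abs(lhs - rhs)
```

The enumeration visits $3^{5} = 243$ paths here, so the check is exact up to rounding; the cost grows exponentially in $n$ and is meant for verification only.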

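Theorem 5.5.1 For any $\varphi_n \in \mathcal{B}_b(S^{n+2})$ and $n \in \mathbb{N}$, we have the following
Feynman-Kac formulae:

$$\mathbb{E}^K_{(y_0,y_1)}\Big(\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=1}^{n} G(Y_p,Y_{p+1})\Big)
= \frac{d\pi L^n}{d\pi}(y_1)\ \mathbb{E}^L_{\pi \times L}\big(\varphi_n(Y_{n+1},\ldots,Y_0) \mid (Y_{n+1},Y_n) = (y_0,y_1)\big) \qquad (5.11)$$

and

$$\mathbb{E}^K_{(y_0,y_1)}\Big(\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=1}^{n-1} G(Y_p,Y_{p+1})\Big)
= \frac{d\pi L^{n-1}}{d\pi}(y_1)\ \mathbb{E}^L_{(\pi \times K)_2}\big(\varphi_n(Y_{n+1},\ldots,Y_0) \mid (Y_{n+1},Y_n) = (y_0,y_1)\big) \qquad (5.12)$$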



Proof: 

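First we use the time reversal formula presented in Lemma 5.5.1 to check
that for any $\varphi \in \mathcal{B}_b(E)$

$$\begin{aligned}
\mathbb{E}^K_{\pi \times K}\Big(\varphi(Y_0,Y_1) \prod_{p=1}^{n} G(Y_p,Y_{p+1})\Big)
&= \mathbb{E}^K_{\pi \times K}\Big(\varphi(Y_0,Y_1)\, G(Y_0,Y_1)^{-1} \prod_{p=0}^{n} G(Y_p,Y_{p+1})\Big) \\
&= \mathbb{E}^L_{\pi \times L}\big(\varphi(Y_{n+1},Y_n)\, G(Y_{n+1},Y_n)^{-1}\big) \\
&= \int \frac{d\pi L^n}{d\pi}(y_n)\, \pi(dy_{n+1})\, K(y_{n+1},dy_n)\, \varphi(y_{n+1},y_n)
\end{aligned}$$

This yields that

$$\mathbb{E}^K_{\pi \times K}\Big(\varphi(Y_0,Y_1) \prod_{p=1}^{n} G(Y_p,Y_{p+1})\Big) = \mathbb{E}^K_{\pi \times K}\Big(\varphi(Y_0,Y_1)\, \frac{d\pi L^n}{d\pi}(Y_1)\Big)$$

Since this formula holds true for any $\varphi \in \mathcal{B}_b(E)$, we conclude that for any
$(y_0,y_1) \in E$

$$\mathbb{E}^K_{(y_0,y_1)}\Big(\prod_{p=1}^{n} G(Y_p,Y_{p+1})\Big) = \frac{d\pi L^n}{d\pi}(y_1) \qquad (5.13)$$

From Lemma 5.5.1, we find that for any $\varphi' \in \mathcal{B}_b(E)$

$$\begin{aligned}
\mathbb{E}^L_{\pi \times L}\big(\varphi'(Y_{n+1},Y_n)\, \varphi_n(Y_{n+1},\ldots,Y_0)\big)
&= \mathbb{E}^K_{\pi \times K}\Big(\varphi'(Y_0,Y_1)\, \varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=0}^{n} G(Y_p,Y_{p+1})\Big) \\
&= \mathbb{E}^K_{\pi \times K}\Big(\varphi'(Y_0,Y_1)\ \mathbb{E}^K_{(Y_0,Y_1)}\Big[\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=0}^{n} G(Y_p,Y_{p+1})\Big]\Big)
\end{aligned}$$

Using (5.13), we also prove that

$$\begin{aligned}
&\mathbb{E}^L_{\pi \times L}\big(\varphi'(Y_{n+1},Y_n)\, \varphi_n(Y_{n+1},\ldots,Y_0)\big) \\
&\quad = \mathbb{E}^K_{\pi \times K}\Big(\varphi'(Y_0,Y_1) \prod_{p=0}^{n} G(Y_p,Y_{p+1})\ \frac{d\pi}{d\pi L^n}(Y_1)\ \mathbb{E}^K_{(Y_0,Y_1)}\Big[\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=1}^{n} G(Y_p,Y_{p+1})\Big]\Big)
\end{aligned}$$

Then, by Lemma 5.5.1, we get

$$\begin{aligned}
&\mathbb{E}^L_{\pi \times L}\big(\varphi'(Y_{n+1},Y_n)\, \varphi_n(Y_{n+1},\ldots,Y_0)\big) \\
&\quad = \mathbb{E}^L_{\pi \times L}\Big(\varphi'(Y_{n+1},Y_n)\ \frac{d\pi}{d\pi L^n}(Y_n)\ \mathbb{E}^K_{(Y_{n+1},Y_n)}\Big[\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=1}^{n} G(Y_p,Y_{p+1})\Big]\Big)
\end{aligned}$$

Since this formula is valid for any $\varphi'$, we conclude that for any $(y_0,y_1) \in E$

$$\mathbb{E}^K_{(y_0,y_1)}\Big(\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=1}^{n} G(Y_p,Y_{p+1})\Big)
= \frac{d\pi L^n}{d\pi}(y_1)\ \mathbb{E}^L_{\pi \times L}\big(\varphi_n(Y_{n+1},\ldots,Y_0) \mid (Y_{n+1},Y_n) = (y_0,y_1)\big)$$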

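This ends the proof of the first assertion of the theorem. To prove the
second formula, we first observe that

$$\mathbb{E}^L_{(\pi \times K)_2}\big(\varphi_n(Y_{n+1},\ldots,Y_0)\big) = \mathbb{E}^L_{\pi \times L}\big(\varphi_n(Y_{n+1},\ldots,Y_0)\, G(Y_1,Y_0)^{-1}\big)$$

By Lemma 5.5.1, we find that

$$\mathbb{E}^L_{(\pi \times K)_2}\big(\varphi_n(Y_{n+1},\ldots,Y_0)\big)
= \mathbb{E}^K_{\pi \times K}\Big(\varphi_n(Y_0,\ldots,Y_{n+1})\, G(Y_n,Y_{n+1})^{-1} \prod_{p=0}^{n} G(Y_p,Y_{p+1})\Big)$$

and therefore

$$\mathbb{E}^L_{(\pi \times K)_2}\big(\varphi_n(Y_{n+1},\ldots,Y_0)\big)
= \mathbb{E}^K_{(\pi \times K)_1}\Big(\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=0}^{n-1} G(Y_p,Y_{p+1})\Big) \qquad (5.14)$$

From this formula, we prove that for any $\varphi' \in \mathcal{B}_b(E)$

$$\begin{aligned}
\mathbb{E}^L_{(\pi \times K)_2}\big(\varphi'(Y_{n+1},Y_n)\, \varphi_n(Y_{n+1},\ldots,Y_0)\big)
&= \mathbb{E}^K_{(\pi \times K)_1}\Big(\varphi'(Y_0,Y_1)\, \varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=0}^{n-1} G(Y_p,Y_{p+1})\Big) \\
&= \mathbb{E}^K_{(\pi \times K)_1}\Big(\varphi'(Y_0,Y_1)\ \mathbb{E}^K_{(Y_0,Y_1)}\Big[\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=0}^{n-1} G(Y_p,Y_{p+1})\Big]\Big)
\end{aligned}$$

On the other hand, by (5.13) we have for any $(y_0,y_1) \in E$ and $n \ge 1$

$$\mathbb{E}^K_{(y_0,y_1)}\Big(\prod_{p=1}^{n-1} G(Y_p,Y_{p+1})\Big) = \frac{d\pi L^{n-1}}{d\pi}(y_1)$$

This yields that

$$\begin{aligned}
&\mathbb{E}^L_{(\pi \times K)_2}\big(\varphi'(Y_{n+1},Y_n)\, \varphi_n(Y_{n+1},\ldots,Y_0)\big) \\
&\quad = \mathbb{E}^K_{(\pi \times K)_1}\Big(\varphi'(Y_0,Y_1) \prod_{p=0}^{n-1} G(Y_p,Y_{p+1})\ \frac{d\pi}{d\pi L^{n-1}}(Y_1)\ \mathbb{E}^K_{(Y_0,Y_1)}\Big[\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=1}^{n-1} G(Y_p,Y_{p+1})\Big]\Big)
\end{aligned}$$

Finally, using (5.14), we arrive at

$$\begin{aligned}
&\mathbb{E}^L_{(\pi \times K)_2}\big(\varphi'(Y_{n+1},Y_n)\, \varphi_n(Y_{n+1},\ldots,Y_0)\big) \\
&\quad = \mathbb{E}^L_{(\pi \times K)_2}\Big(\varphi'(Y_{n+1},Y_n)\ \frac{d\pi}{d\pi L^{n-1}}(Y_n)\ \mathbb{E}^K_{(Y_{n+1},Y_n)}\Big[\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=1}^{n-1} G(Y_p,Y_{p+1})\Big]\Big)
\end{aligned}$$

Since this formula holds true for any $\varphi'$, we conclude that for any pair
$(y_0,y_1) \in E$ we have

$$\mathbb{E}^K_{(y_0,y_1)}\Big(\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=1}^{n-1} G(Y_p,Y_{p+1})\Big)
= \frac{d\pi L^{n-1}}{d\pi}(y_1)\ \mathbb{E}^L_{(\pi \times K)_2}\big(\varphi_n(Y_{n+1},\ldots,Y_0) \mid (Y_{n+1},Y_n) = (y_0,y_1)\big)$$

This ends the proof of the theorem. ■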







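Corollary 5.5.1 For any $\varphi_n \in \mathcal{B}_b(S^{n+2})$, $n \in \mathbb{N}$, and $y \in S$, we have the
Feynman-Kac formulae

$$\mathbb{E}^K_{(\delta_y \times K)_1}\Big(\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=0}^{n} G(Y_p,Y_{p+1})\Big)
= \frac{d\pi L^{n+1}}{d\pi}(y)\ \mathbb{E}^L_{(\pi \times L)_1}\big(\varphi_n(Y_{n+1},\ldots,Y_0) \mid Y_{n+1} = y\big) \qquad (5.15)$$

and

$$\mathbb{E}^K_{(\delta_y \times K)_1}\Big(\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=0}^{n-1} G(Y_p,Y_{p+1})\Big)
= \frac{d\pi L^{n}}{d\pi}(y)\ \mathbb{E}^L_{(\pi \times K)_2}\big(\varphi_n(Y_{n+1},\ldots,Y_0) \mid Y_{n+1} = y\big) \qquad (5.16)$$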



Proof: 

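The proof of (5.15) is based on (5.11). Indeed, in view of (5.11), we have
for any $(y_0,y_1) \in S^2$

$$\begin{aligned}
\mathbb{E}^K_{(\delta_{y_1} \times K)_1}\Big(\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=0}^{n} G(Y_p,Y_{p+1})\Big)
&= \mathbb{E}^K_{(y_0,y_1)}\Big(\varphi_n(Y_1,\ldots,Y_{n+2}) \prod_{p=1}^{n+1} G(Y_p,Y_{p+1})\Big) \\
&= \frac{d\pi L^{n+1}}{d\pi}(y_1)\ \mathbb{E}^L_{(\pi \times L)_1}\big(\varphi_n(Y_{n+1},\ldots,Y_0) \mid Y_{n+1} = y_1\big)
\end{aligned}$$

This ends the proof of the first assertion. In much the same way, we prove
(5.16) using (5.12). We simply note that for any $(y_0,y_1) \in S^2$

$$\begin{aligned}
\mathbb{E}^K_{(\delta_{y_1} \times K)_1}\Big(\varphi_n(Y_0,\ldots,Y_{n+1}) \prod_{p=0}^{n-1} G(Y_p,Y_{p+1})\Big)
&= \mathbb{E}^K_{(y_0,y_1)}\Big(\varphi_n(Y_1,\ldots,Y_{n+2}) \prod_{p=1}^{n} G(Y_p,Y_{p+1})\Big) \\
&= \frac{d\pi L^{n}}{d\pi}(y_1)\ \mathbb{E}^L_{(\pi \times K)_2}\big(\varphi_n(Y_{n+1},\ldots,Y_0) \mid Y_{n+1} = y_1\big)
\end{aligned}$$

This ends the proof of the corollary. ■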

This corollary shows that the limiting distribution of the genealogical
particle model presented in (5.5), (5.8), and (5.9) coincides with the desired
conditional distribution. More precisely, using (5.15), we conclude that for
any $\varphi_n \in \mathcal{B}_b(S^{n+1})$

$$\frac{\mathbb{E}^K_y\Big(\varphi_n(Y_0,Y_1,\ldots,Y_n) \prod_{p=0}^{n-1} G(Y_p,Y_{p+1})\Big)}{\mathbb{E}^K_y\Big(\prod_{p=0}^{n-1} G(Y_p,Y_{p+1})\Big)}
= \mathbb{E}^L_{(\pi \times L)_1}\big(\varphi_n(Y_n,Y_{n-1},\ldots,Y_0) \mid Y_n = y\big)$$

Furthermore, by (5.16), we also find that

$$\mathbb{Q}^K_{(\delta_y \times K)_1,n} = \mathbb{P}^L_{(\pi \times K)_2}\big(((Y_{n+1},Y_n),\ldots,(Y_1,Y_0)) \in \cdot \mid Y_{n+1} = y\big)$$



5.5.4 Stability Properties 

In this section, we analyze in some detail the stability properties of Feyn- 
man-Kac-Metropolis models and we improve the contraction inequalities 
presented in Theorem 5.2.1. We present a natural mixing condition under 
which the decays to equilibrium do not depend on the nature of the target 
invariant distribution. Without further mention, we assume that the triplet
$(\pi, K, L)$ is chosen such that the Metropolis potential function $G$ satisfies
the following condition:

$(G)$: There exists an $\epsilon(G) > 0$ such that for any pair $(x,x') \in E^2$ we
have

$$G(x) \ge \epsilon(G)\, G(x') \quad (> 0)$$

It is instructive to observe that this condition can alternatively be written
in terms of the pairs $(\pi, K)$ and $(\pi, L)$ as follows: for any $x, x' \in E = S^2$

$$\epsilon(G) \le G(x)/G(x') = \frac{d(\pi \times L)_2}{d(\pi \times K)_1}(x)\ \frac{d(\pi \times K)_1}{d(\pi \times L)_2}(x') \le 1/\epsilon(G)$$



Also notice that the latter is equivalent to the following inequalities 



y/^{irxK)i < {itxLh < 



1 



(tt X K)i 



Using a clear induction on the time parameter n E N, we also find that, for 
any n e N, we have irU* < tt, and for any y € S 



dirL’^ 

d-K 



(y) € (c(G)"/2,e(G)-’*/*l 



In this section, we use the same terminology as that used in Section 2.7,
but to clarify the presentation, we simplify notation and we use the
superscript $(\cdot)^{n-p}$ instead of the subscript $(\cdot)_{p,n}$ to represent the $(n - p)$
iterates of a semigroup (from time p to time n). For instance, we write 

p^“P, P«-p, $^-p, $^-P) 



instead of 



(Qp,m Qp,n> Ppffij ^p,n> ^p,n) 



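We recall that the Feynman-Kac semigroups $\Phi^n$ and $\hat\Phi^n$ associated with
the pair $(G, M)$ can alternatively be defined by the expressions

$$\Phi^n(\mu)(f) = \frac{\mathbb{E}^K_\mu\big(f(X_n) \prod_{p=0}^{n-1} G(X_p)\big)}{\mathbb{E}^K_\mu\big(\prod_{p=0}^{n-1} G(X_p)\big)} = \frac{\mu Q^n(f)}{\mu Q^n(1)}$$

and

$$\hat\Phi^n(\mu)(f) = \frac{\mathbb{E}^K_\mu\big(f(X_n) \prod_{p=1}^{n} G(X_p)\big)}{\mathbb{E}^K_\mu\big(\prod_{p=1}^{n} G(X_p)\big)} = \frac{\mu \hat Q^n(f)}{\mu \hat Q^n(1)}$$

Let $G^n$ and $\hat G^n$ denote the potential functions on $E$ defined by

$$G^n(x) = Q^n(1)(x) \quad \text{and} \quad \hat G^n(x) = \hat Q^n(1)(x)$$

From the previous displayed formulae, we see that $\Phi^n$ and $\hat\Phi^n$ can be
rewritten as

$$\Phi^n(\mu) = \Psi^n(\mu) P^n \quad \text{and} \quad \hat\Phi^n(\mu) = \hat\Psi^n(\mu) \hat P^n \qquad (5.17)$$

with the Boltzmann-Gibbs transformations $\Psi^n$ and $\hat\Psi^n$ associated with $G^n$
and $\hat G^n$ and defined by

$$\Psi^n(\mu)(dx) = \frac{G^n(x)}{\mu(G^n)}\, \mu(dx) \quad \text{and} \quad \hat\Psi^n(\mu)(dx) = \frac{\hat G^n(x)}{\mu(\hat G^n)}\, \mu(dx)$$

The notations $(G^n, P^n, \hat G^n, \hat P^n)$ may be somewhat confusing. To prevent
any kind of confusion, we emphasize that even for time-homogeneous models
the pair sequence of transformations $(\Psi^n, \hat\Psi^n)$ as well as the Markov kernels $P^n$
and $\hat P^n$ are not the $n$ iterates of $\Psi$, $P^1$, and $\hat P^1$. We also mention that the
decompositions (5.17) coincide with the ones presented in Section 2.7 when
replacing $(G_{p,n}, P_{p,n}, \Psi_{p,n})$ and $(\hat G_{p,n}, \hat P_{p,n}, \hat\Psi_{p,n})$ by $(G^{n-p}, P^{n-p}, \Psi^{n-p})$
and $(\hat G^{n-p}, \hat P^{n-p}, \hat\Psi^{n-p})$.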

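Our immediate objective is to provide an "explicit" description of the
Feynman-Kac semigroups $Q^n$ and $\hat Q^n$ in terms of the triplet $(\pi, K, L)$.
Using (5.12), we prove that

$$\begin{aligned}
Q^n(f)(y_0,y_1) &= \mathbb{E}^K_{(y_0,y_1)}\Big(f(X_n) \prod_{p=0}^{n-1} G(X_p)\Big) \\
&= G(y_0,y_1)\ \mathbb{E}^K_{(y_0,y_1)}\Big(f(Y_n,Y_{n+1}) \prod_{p=1}^{n-1} G(Y_p,Y_{p+1})\Big) \\
&= G(y_0,y_1)\ \frac{d\pi L^{n-1}}{d\pi}(y_1)\ \mathbb{E}^L_{(\pi \times K)_2}\big(f(Y_1,Y_0) \mid Y_n = y_1\big)
\end{aligned}$$

Now, by (5.11), we find that

$$\begin{aligned}
\hat Q^n(f)(y_0,y_1) &= \mathbb{E}^K_{(y_0,y_1)}\Big(f(X_n) \prod_{p=1}^{n} G(X_p)\Big) \\
&= \mathbb{E}^K_{(y_0,y_1)}\Big(f(Y_n,Y_{n+1}) \prod_{p=1}^{n} G(Y_p,Y_{p+1})\Big) \\
&= \frac{d\pi L^{n}}{d\pi}(y_1)\ \mathbb{E}^L_{(\pi \times L)_1}\big(f(Y_1,Y_0) \mid Y_n = y_1\big)
\end{aligned}$$

From these two formulae, we conclude that

$$P^n(f)(y_0,y_1) = \mathbb{E}^L_{(\pi \times K)_2}\big(f(Y_1,Y_0) \mid Y_n = y_1\big)$$

$$\hat P^n(f)(y_0,y_1) = \mathbb{E}^L_{(\pi \times L)_1}\big(f(Y_1,Y_0) \mid Y_n = y_1\big)$$

For a sufficiently regular kernel $L$ and for $n$ large enough, we also find the
following "explicit" descriptions of the nonhomogeneous potential functions

$$G^n(y_0,y_1) = G(y_0,y_1)\ \frac{d\pi L^{n-1}}{d\pi}(y_1) \quad \text{and} \quad \hat G^n(y_0,y_1) = \frac{d\pi L^{n}}{d\pi}(y_1) \qquad (5.18)$$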

From previous considerations, we see that the semigroups $\Phi^n$ and $\hat\Phi^n$
have a "nice" explicit formulation in terms of the triplet $(\pi, K, L)$. Thus,
one expects to improve the stability results discussed in Section 4.3.3 in the
context of abstract nonhomogeneous models. We will examine this question
through three different angles related respectively to the oscillations of
the semigroups and to the weak and the strong contraction properties. As
usual, in order to unify the statements of this section, we denote by $H$ the
total variation distance on $\mathcal{P}(E)$ or any $h'$-divergence satisfying the growth
condition $(H)_a$, stated on page 135. We recall that, for these two classes
of distance-like criteria, $a_H$ is the function on $\mathbb{R}_+$ defined respectively by
$a_H(r) = 2r$ and $a_H(r) = r\, a(r)$ (see Definition 4.3.2).

1. First we observe from Proposition 4.3.1 that /3(P") is a measure of 
the oscillations of the mapping with respect to the toted variation 




182 



5. Invariant Measures and Related Topics 



distance. That is, we have that 

^(P”) = sup ||$”(Ai)-$"(»/)||tv 

2. The Dobrushin ergodic coefficient also appears useful in the analysis
of the weak regularity properties of the semigroups. Using Proposition 4.3.7,
we prove that for any $(n, f, \mu) \in (\mathbb{N} \times \mathrm{Osc}_1(E) \times \mathcal{P}(E))$
there exists a function $f^n_\mu \in \mathrm{Osc}_1(E)$ such that for any $\eta \in \mathcal{P}(E)$

$$|[\Phi^n(\eta) - \Phi^n(\mu)](f)| \le 2\,\beta(P^n)\, \sup_{x,x'} \big(G^n(x)/G^n(x')\big)\; |(\eta - \mu)(f^n_\mu)|$$

3. The preceding weak regularity properties can be turned into strong
ones. From Theorem 4.3.1, we have

$$H(\Phi^n(\mu), \Phi^n(\eta)) \le \beta(P^n)\; a_H\Big(\sup_{x,x'} [G^n(x)/G^n(x')]\Big)\; H(\mu, \eta)$$

From the arguments given in Section 4.4, we emphasize that these three
types of regularity properties remain valid if we replace in the preceding
inequalities the triplet $(G^n, P^n, \Phi^n)$ by the corresponding updated objects
$(\widehat{G}^n, \widehat{P}^n, \widehat{\Phi}^n)$.

In these three routes, the stability properties of the semigroups $\Phi^n$ and
$\widehat{\Phi}^n$ are expressed in terms of the regularity properties of the Markov kernels
$(P^n, \widehat{P}^n)$ and the relative oscillations of the potential functions $(G^n, \widehat{G}^n)$.

To estimate these properties, we introduce the following condition:

$(L)_m$: There exist an integer parameter $m \ge 1$ and some $\epsilon(L) > 0$ such
that for any pair $(y, y') \in S^2$ we have

$$L^m(y, \cdot) \ge \epsilon(L)\, L^m(y', \cdot)$$

Under the latter, it follows from (4.12) that $\beta(L^m) \le (1 - \epsilon(L))$. This shows
in particular that $L$ has a unique invariant measure $\nu_L = \nu_L L \in \mathcal{P}(S)$, and
using (4.10) we prove that

$$\|\mu L^{nm} - \nu_L\|_{tv} \le (1 - \epsilon(L))^n\, \|\mu - \nu_L\|_{tv}$$
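For a finite state space, condition $(L)_m$ and the resulting bound on the Dobrushin coefficient can be checked numerically. The following is a minimal sketch under stated assumptions: the 3-state kernel `L`, the choice `m = 2`, and the helper names `mat_pow`, `mixing_constant` and `dobrushin` are all hypothetical and introduced only for illustration. The Dobrushin coefficient is computed as the largest total variation distance between two rows of the kernel.

```python
from itertools import product


def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(a, m):
    """m-th power of a square matrix, for m >= 1."""
    result = a
    for _ in range(m - 1):
        result = mat_mul(result, a)
    return result


def mixing_constant(lm):
    """Largest eps with lm[y][z] >= eps * lm[y2][z] for all y, y2, z."""
    n = len(lm)
    ratios = [lm[y][z] / lm[y2][z]
              for y, y2, z in product(range(n), repeat=3)
              if lm[y2][z] > 0]
    return min(ratios)


def dobrushin(kernel):
    """beta(kernel): sup over pairs of rows of their total variation distance."""
    n = len(kernel)
    return max(0.5 * sum(abs(kernel[y][z] - kernel[y2][z]) for z in range(n))
               for y in range(n) for y2 in range(n))


# Hypothetical 3-state kernel; L itself has zero entries, but L^2 is positive.
L = [[0.5, 0.5, 0.0],
     [0.2, 0.3, 0.5],
     [0.0, 0.4, 0.6]]
m = 2
Lm = mat_pow(L, m)
eps = mixing_constant(Lm)
beta = dobrushin(Lm)
print(f"eps(L) = {eps:.4f}, beta(L^m) = {beta:.4f}, 1 - eps(L) = {1 - eps:.4f}")
```

The example is chosen so that $L$ itself does not satisfy the condition with $m = 1$ (it has zero entries) while $L^2$ does, which is the situation the integer parameter $m$ is meant to cover.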

Furthermore, recalling that $L(y, \cdot) \ll \pi$, we also find that $\nu_L \ll \pi$. Another
simplification due to this condition is that the conditional Markov kernels
$P^{m+n+1}$ and $\widehat{P}^{m+n+1}$ have the integral representations

^"“^"■^\/)(l«),yi) = j Tr{dy[)K{y[,dy'o) /(l/i>yo) 

P”'^^^Hf){yo,yi) = /’T(dyo)^(y^.dy()^^^j^(yi)/(y(,y'^ 

(5.19) 







Finally, under {L)m we also find that for any y £ S and n > m we have 
•) ^ “d for any y' £ S 

The following proposition is pivotal. 

Proposition 5.5.1 When the Markov kernel L satisfies condition (L)m, 
then we have for any n G N 

/?(pn+m+l) < 2e-\L) 0{L^) 



and for any x, x* £ E 

> e^{L) 

In addition, suppose condition (G) holds true and the pair of measures 
(tt, vl) on S are such that 

AK,.) = ^ > 0 (5-2«) 

Then we have the uniform estimate 

G”+”*+i(x) > e{G) e*(L) h{uL,ir) G"+"*+‘(x') 



The proof of Proposition 5.5.1 is based on the following technical lemma. 

Lemma 5.5.2 Let $\mu \in \mathcal{P}(S)$, and let $(M_1, M_2)$ be a pair of Markov kernels
on $(S, \mathcal{S})$ such that for any $y \in S$ we have



M 2 (y, .) < pM\M 2 < M\{y , .) 



Then we have for any (y, y') £ 



dMiM2{y, .) . _ 



< 20{Mi) 



dMijz, .) 

z,z'€S dM2(z', .) 



(s/0 



Proof: 

We first use the decomposition 



dMiM2{y,.) ,. 
d/iMiMj ^ ^ 






to check that 



dMiM2(y,.) ,. 



< 2/9(Mi) 



sup 

»"€S 



dm£^ 

dfiMiM2 ^ ’ 







Since we also have 



dnM\M2 . > 



nMi{dy) 



dM2{y, ♦) 
dM2{y",.) 



(z)> 



dM2{u,.) 

u,u'6S dM2{u', .) 



{z) 



the end of the proof is straightforward. 



Proof of Proposition 5.5.1: The proof of the first assertion is a direct
consequence of Lemma 5.5.2. By (5.19), we have for any $f \in \mathcal{B}_b(E)$ with
$\|f\| \le 1$ and $n \in \mathbb{N}$

|pm+n+l(^)(j^O,yi)-(7rxL)2(/)| 

Under {L)m, we have for any y £ S 

.) < < L"(y, .) 



/,(«)! 
j »iW) 



dnL”'+^ 
d(7rL)L”*+'‘ 



(yi) - 1 

(yi) - 1 



Applying Lemma 5.5.2 to the pair $(M_1, M_2) = (L^n, L^m)$ and to the measures
$\mu \in \{\pi, \pi L\}$, the proof of the first assertion is clear. To check the last
two inequalities, we use the descriptions of the nonhomogeneous potential
functions given in (5.18). We have for any $n \in \mathbb{N}$, $x = (y_0, y_1) \in E$, and
$x' = (y_0', y_1') \in E$



CT"+"+^(x) 



G{x) dnL”'+'' 
G{x') dit 



(yi) 



dn 

d7rLT"+n 



(y'l) 



G”+"(x) _ d7ri”‘+'‘ d 7 T , 

gm-hn(a./) ~ dTT d7rL”*+"^^^^ 



On the other hand, under {L)m we have for any (y, y') £ E 



L"'{y,.)<i'L and ^^^-r^-^{y') £ [e{L),€ ^(L)] 
This implies that 






(»') = j ^v'm ^(so 



from which we conclude that 

c2(L) h{vL,-n) < ^^^_(y)^-^^(y') < e-2(L) h~\uL,1^) 

The end of the proof of the proposition is now clear. ■ 







It is useful at this stage to make some remarks concerning Proposition 5.5.1.
First we mention that the quantity $h(\nu_L, \pi)$ represents the inverse of the
Hilbert projective distance between $\nu_L$ and $\pi$. In this interpretation,
condition (5.20) means that the Hilbert distance between the target distribution
and the $L$-invariant measure $\nu_L$ is finite. We also note that condition $(G)$ is
not a new condition. It is only the time-homogeneous version of the hypothesis
$(G)$ introduced in Section 4.3.2. Suppose that the target distribution $\pi$
is a Boltzmann-Gibbs measure associated with a given nonnegative energy
function $V$ on $S$

$$\pi(dy) = \frac{1}{\nu(e^{-V})}\, e^{-V(y)}\, \nu(dy)$$

where $\nu$ is a reference nonnegative measure such that $\nu(e^{-V}) > 0$. Further
assume that the pair of Markov kernels $(K, L)$ is chosen such that
$(\nu \times L)_2 \ll (\nu \times K)_1$. In this situation, we notice that

$$G(y, y') = e^{-(V(y') - V(y))}\, \frac{d(\nu \times L)_2}{d(\nu \times K)_1}(y, y')$$

If V has finite oscillations osc(V) < oo, then condition (G) is met with 
c(G) > X L)2,(i^ X /r)i) 

In addition, we find that 

h{vL,n) > h{yi,v) 

When the measure v is L-reversible and L = K, then we have v = vl and 

a c(G) > exp (-osc(V)) 

More interestingly, the mixing condition $(L)_m$, which allows control of the
Dobrushin ergodic coefficients $(\beta(P^n), \beta(\widehat{P}^n))$, is not directly related to
the traditional mixing condition $(M)_m$ usually made on the Markov kernel
$M$ associated with the Feynman-Kac model. The former condition
is dictated by the particular nature of the Metropolis potential functions.
Finally, we note that the estimates on the Dobrushin ergodic coefficients
$(\beta(P^n), \beta(\widehat{P}^n))$ as well as the uniform control of the relative oscillations of
$\widehat{G}^n$ do not depend on condition $(G)$.

If we combine the preceding proposition with the three routes described 
earlier, we get three types of stability theorems. 

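Theorem 5.5.2 (asymptotic stability) Suppose condition $(L)_m$ is satisfied
for some $m \ge 1$. Then, for any $n \in \mathbb{N}$ and $(\mu, \eta) \in \mathcal{P}(E)^2$, we have

$$\|\Phi^{m+n+1}(\mu) - \Phi^{m+n+1}(\eta)\|_{tv} \le 2\,\epsilon^{-1}(L)\, \beta(L^n)$$

and

$$\|\widehat{\Phi}^{m+n+1}(\mu) - \widehat{\Phi}^{m+n+1}(\eta)\|_{tv} \le 2\,\epsilon^{-1}(L)\, \beta(L^n)$$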







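Theorem 5.5.3 (weak contraction) Suppose condition $(G)$ is satisfied.
Also assume that $(L)_m$ is met for some $m \ge 1$ and the pair of distributions
$(\nu_L, \pi)$ is such that $h(\nu_L, \pi) > 0$.

Then, for any $(n, f, \mu) \in (\mathbb{N} \times \mathrm{Osc}_1(E) \times \mathcal{P}(E))$ there exists a pair of
functions $(f^n_\mu, \widehat{f}^n_\mu) \in \mathrm{Osc}_1(E)^2$ such that for any $\eta \in \mathcal{P}(E)$

$$|[\Phi^{m+n+1}(\eta) - \Phi^{m+n+1}(\mu)](f)| \le c(\pi, G, L)\; \beta(L^n)\; |(\eta - \mu)(f^n_\mu)|$$

$$|[\widehat{\Phi}^{m+n+1}(\eta) - \widehat{\Phi}^{m+n+1}(\mu)](f)| \le \widehat{c}(\pi, L)\; \beta(L^n)\; |(\eta - \mu)(\widehat{f}^n_\mu)|$$

for some finite constants $c(\pi, G, L)$ and $\widehat{c}(\pi, L)$ such that

$$c(\pi, G, L) \le 4/[\epsilon(G)\,\epsilon^3(L)\,h(\nu_L, \pi)] \quad\text{and}\quad \widehat{c}(\pi, L) \le 4/[\epsilon^3(L)\,h(\nu_L, \pi)]$$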

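Theorem 5.5.4 (strong contraction) Under the assumptions of Theorem 5.5.3,
the distributions $(\pi \times K)_1$ and $(\pi \times L)_2$ are the unique invariant
measures of $\Phi$ and $\widehat{\Phi}$. Furthermore, we have for any $n \in \mathbb{N}$ and any pair
$(\mu, \eta) \in \mathcal{P}(E)^2$

$$H(\Phi^{m+n+1}(\mu), \Phi^{m+n+1}(\eta)) \le c_H(\pi, G, L)\; \beta(L^n)\; H(\mu, \eta)$$

$$H(\widehat{\Phi}^{m+n+1}(\mu), \widehat{\Phi}^{m+n+1}(\eta)) \le \widehat{c}_H(\pi, L)\; \beta(L^n)\; H(\mu, \eta)$$

for some finite constants $c_H(\pi, G, L)$ and $\widehat{c}_H(\pi, L)$ such that

$$c_H(\pi, G, L) \le 2\,\epsilon^{-1}(L)\; a_H\big((\epsilon(G)\,\epsilon^2(L)\,h(\nu_L, \pi))^{-1}\big)$$
$$\widehat{c}_H(\pi, L) \le 2\,\epsilon^{-1}(L)\; a_H\big((\epsilon^2(L)\,h(\nu_L, \pi))^{-1}\big)$$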




6 

Annealing Properties 



6.1 Introduction 

This chapter is concerned with the long time behavior of a Feynman-Kac 
model associated with a potential function and a cooling schedule. In contrast
to the annealed and quenched Feynman-Kac models discussed in Section 2.6,
the word “annealed” here does not refer to an integration with respect
to the law of a given random medium but to a freezing medium. We have
chosen to examine this question for updated rather than for prediction 
Feynman-Kac models. There are two main reasons that explain this choice. 

First, the stability properties of Feynman-Kac semigroups presented in 
Section 4.3 and in Section 4.4 seem to indicate that the updated semigroups 
have “better” contraction properties than the prediction ones. For instance, 
when the underlying Markov kernel is sufficiently mixing, the contraction
coefficients of the updated semigroup do not depend on the potential func- 
tion (compare for instance Proposition 4.3.6 and Proposition 4.4.3). On the 
other hand, the prediction models are connected to the updated ones by a 
simple Markov integral operation. We can use this observation to transfer 
annealed properties of updated models to prediction models. In the reverse 
angle, the probability mass repartition of updated distributions is better 
related to the energy function. We will examine two types of Feynman-Kac 
models. 

The first one is the Feynman-Kac-Metropolis model introduced in Sec- 
tion 5.5. We recall that this model can be interpreted as an interacting 
Metropolis Markov chain admitting a given distribution as an invariant 







measure. The interaction comes from the fact that the elementary tran- 
sitions of this nonhomogeneous Markov chain depend on the laws of the 
random states. This model is defined in terms of a Metropolis potential 
function and a Markov kernel on the space of all possible transitions. Sup- 
pose the invariant measure is the Gibbs distribution associated with an 
energy function. In this situation, the Metropolis potential is a measure of 
the difference of energy in a given elementary transition. In this context, 
the cooling schedule corresponds to the selection pressure in the random 
exploration search of the space of transitions. In this connection, the
resulting Markov chain can be interpreted as a nonlinear simulated annealing
model. In this context, it is important to find sufficient conditions on the
exploration transitions and on the cooling schedule that ensure that the
model concentrates on the global minima of the energy function. One
advantage of this model is that the limiting measure is a Gibbs measure. As
a result, the concentration properties of the annealed model are reduced to 
the contraction properties of its semigroup. 

The second model studied in this chapter is related to the trapping prob- 
lem described in Section 2.5.1. In this context, the Feynman-Kac model rep- 
resents the law evolution of a particle motion in an absorbing environment. 
In this situation, the obstacles are related to an energy function and the 
cooling schedule is interpreted as the strength of the obstacles. The more
the temperature decreases, the more stringent the obstacles become. In this
situation, the exploration transition kernel of the particle and the energy
function are dictated by the physical problem at hand. This trapping
problem was formulated and studied in a recent article [101]. The asymptotic
concentration regions correspond to the limiting energy levels visited by 
the particle. In contrast to the first model, the main difficulty here comes 
from the fact that the invariant measure of the Feynman-Kac model is 
generally unknown and the concentration analysis is much more involved. 
We already mention that in some situations a particle with off-diagonal 
transitions may be asymptotically attracted by some trapping regions. 

This chapter is organized as follows. Section 6.2 focuses on the annealed 
properties of Feynman-Kac-Metropolis models. As mentioned earlier, the 
corresponding annealed model can be regarded as a nonlinear simulated an- 
nealing random search. For traditional linear models, it is known that for 
judicious logarithmic cooling schedules the random search concentrates in 
probability to the global minima of the energy function. We show that the 
nonlinear model has the same concentration properties for any “sublinear” 
temperature schedules. In Section 6.3, we discuss the annealed properties of 
a Feynman-Kac model arising in trapping analysis. In Section 6.3.2, we
exhibit two different types of cooling schedules for which the annealed model
has the same concentration property as the invariant distribution flow. In
Section 6.3.4, we characterize the limiting concentration regions in terms 
of a variational problem in distribution space. We show that the concen- 
tration level is related to a competition between the exploration transition 







and the selection potential. In Section 6.3.5, we show that annealed models 
with diagonal exploration transitions do concentrate on the global minima 
of the energy function. We also provide an off-diagonal model in finite space 
that concentrates on an obstacle. 



6.2 Feynman-Kac-Metropolis Models



6.2.1 Description of the Model 

Let $V$ be a nonnegative and bounded potential function on some measurable
space $(S, \mathcal{S})$ and let $\nu \in \mathcal{P}(S)$. We consider the collection of Gibbs
measures $\pi_a$, $a \in \mathbb{R}_+$, on $S$ defined by

$$\pi_a(dy) = \frac{1}{\nu(e^{-aV})}\, e^{-aV(y)}\, \nu(dy)$$

The parameter $a$ is interpreted as an inverse temperature parameter. We
associate with a given pair of Markov kernels $(K, L)$ on $S$ the distributions
$(\nu \times K)_1$ and $(\nu \times L)_2$ on the product space $E = S \times S$. We recall that
these distributions are defined by

$$(\nu \times K)_1(d(y, y')) = \nu(dy)\, K(y, dy')$$

$$(\nu \times L)_2(d(y, y')) = \nu(dy')\, L(y', dy)$$



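We further require that the triplet $(\nu, K, L)$ be chosen such that for any
$x, x' \in E$ we have

$$(\nu \times L)_2 \ll (\nu \times K)_1 \quad\text{and}\quad \frac{d(\nu \times L)_2}{d(\nu \times K)_1}(x) \le g\; \frac{d(\nu \times L)_2}{d(\nu \times K)_1}(x')$$

for some finite $g < \infty$. We denote by $G_a$ the Radon-Nikodym derivative of
$(\pi_a \times L)_2$ with respect to $(\pi_a \times K)_1$

$$G_a = \frac{d(\pi_a \times L)_2}{d(\pi_a \times K)_1}$$

Notice that $G_a$ can be alternatively rewritten as

$$G_a(y, y') = e^{-a(V(y') - V(y))}\, G_0(y, y') \quad\text{with}\quad G_0 = \frac{d(\nu \times L)_2}{d(\nu \times K)_1}$$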



Under our assumptions on $(\nu, K, L, V)$, we observe that for any $x, x' \in E$

$$G_a(x) \le g\, e^{2a\,\mathrm{osc}(V)}\, G_a(x')$$

This shows that the collection of potential functions $G_a$ satisfies the
condition $(G)$ introduced on page 115 with

$$\epsilon(G_a) \ge g^{-1}\, e^{-2a\,\mathrm{osc}(V)}$$







To go one step further, we suppose that the Markov kernel $L$ satisfies the
regularity condition $(L)_m$ presented on page 182. That is, we have, for some
$m \ge 1$, $\epsilon(L) > 0$ and for any $y, y' \in S$

$$L^m(y, \cdot) \ge \epsilon(L)\, L^m(y', \cdot)$$

We recall that under $(L)_m$ the Markov kernel $L$ has a unique invariant
measure $\nu_L = \nu_L L \in \mathcal{P}(S)$ and for any $n \in \mathbb{N}$ we have that






Finally, we further require that the pair of measures $(\nu, \nu_L)$ be chosen so
that $\nu$ and $\nu_L$ are mutually absolutely continuous and

$$h(\nu_L, \nu) = \inf_{y, y' \in S}\; \frac{d\nu_L}{d\nu}(y)\, \frac{d\nu}{d\nu_L}(y') > 0$$



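Under these conditions, we know from Theorem 5.5.4 that the distributions
$(\pi_a \times L)_2$, $a \in \mathbb{R}_+$, are the unique fixed points of the mappings

$$\widehat{\Phi}_a : \mathcal{P}(E) \to \mathcal{P}(E)$$

defined for any $(\mu, f) \in (\mathcal{P}(E) \times \mathcal{B}_b(E))$ by

$$\widehat{\Phi}_a(\mu)(f) = \frac{\mu M_K(G_a f)}{\mu M_K(G_a)}$$

where $M_K$ is the Markov kernel from $E$ into itself defined by

$$M_K((y, y'), d(z, z')) = \delta_{y'}(dz)\, K(z, dz')$$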
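The fixed point property can be checked by hand on a finite state space. The sketch below assumes $S = \{0, 1, 2\}$, a uniform reference measure, and two strictly positive stochastic matrices `K` and `L`; all numerical values and the function names `gibbs`, `metropolis_potential` and `updated_map` are hypothetical and chosen only for illustration. It builds the potential as the ratio $\pi_a(y')L(y',y) / (\pi_a(y)K(y,y'))$, applies the updated mapping once to $(\pi_a \times L)_2$, and compares the result with the starting distribution.

```python
import math


def gibbs(nu, V, a):
    """Gibbs measure pi_a(y) proportional to exp(-a V(y)) nu(y)."""
    w = [nu[y] * math.exp(-a * V[y]) for y in range(len(nu))]
    z = sum(w)
    return [x / z for x in w]


def metropolis_potential(pi, K, L):
    """G[y][y2] = pi(y2) L(y2, y) / (pi(y) K(y, y2)) on E = S x S."""
    n = len(pi)
    return [[pi[y2] * L[y2][y] / (pi[y] * K[y][y2]) for y2 in range(n)]
            for y in range(n)]


def updated_map(mu, G, K):
    """One step mu -> mu M(G .) / mu M(G), with M((y,y'),(z,z')) = 1{z=y'} K(z,z')."""
    n = len(K)
    second_marginal = [sum(mu[y][z] for y in range(n)) for z in range(n)]
    weighted = [[G[z][z2] * second_marginal[z] * K[z][z2] for z2 in range(n)]
                for z in range(n)]
    total = sum(map(sum, weighted))
    return [[w / total for w in row] for row in weighted]


# Hypothetical data: 3 states, uniform reference measure, positive kernels.
V = [0.0, 1.0, 0.5]
nu = [1.0 / 3.0] * 3
K = [[0.2, 0.5, 0.3], [0.4, 0.4, 0.2], [0.3, 0.3, 0.4]]
L = [[0.6, 0.2, 0.2], [0.1, 0.7, 0.2], [0.25, 0.25, 0.5]]
a = 2.0

pi_a = gibbs(nu, V, a)
G_a = metropolis_potential(pi_a, K, L)
G_0 = metropolis_potential(nu, K, L)

# Candidate fixed point (pi_a x L)_2(y, y') = pi_a(y') L(y', y).
target = [[pi_a[y2] * L[y2][y] for y2 in range(3)] for y in range(3)]
image = updated_map(target, G_a, K)
gap = max(abs(image[y][y2] - target[y][y2]) for y in range(3) for y2 in range(3))
print(f"sup-norm distance between the measure and its image: {gap:.2e}")
```

The same script also lets one compare `G_a` with the factorized form $e^{-a(V(y') - V(y))} G_0(y, y')$ entry by entry.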



To simplify the presentation, we slightly abuse the notation and denote by
$a : \mathbb{N} \to \mathbb{R}_+$ a given nondecreasing function. The annealed
Feynman-Kac-Metropolis model associated with the pair $(G_a, M_K)$ is the
distribution flow $\widehat{\eta}_n \in \mathcal{P}(E)$ defined by the recursive equation

$$\widehat{\eta}_n = \widehat{\Phi}_{a(n)}(\widehat{\eta}_{n-1})$$

with a given initial distribution $\widehat{\eta}_0$ on $E$. We emphasize that $\widehat{\eta}_n$ is
alternatively defined by the Feynman-Kac formulae

$$\widehat{\eta}_n(f) = \widehat{\gamma}_n(f)/\widehat{\gamma}_n(1) \quad\text{and}\quad \widehat{\gamma}_n(f) = \mathbb{E}_{\widehat{\eta}_0}\Big(f(X_n) \prod_{p=1}^{n} G_{a(p)}(X_p)\Big)$$

where $\mathbb{E}_{\widehat{\eta}_0}(\cdot)$ represents the expectation with respect to the law of a
Markov chain $X_n$ with initial distribution $\widehat{\eta}_0$ and Markov transitions $M_K$.
In this context, $M_K$ represents an exploration Markov kernel on the
transition space $E = S^2$ and $G_{a(n)}$ represents the energy landscape at
temperature $1/a(n)$. The precise description of the nonlinear simulated
annealing algorithm associated with this flow can be found in Section 5.5
on page 168.

In this section, we consider the problem of finding a nondecreasing cooling
schedule $a$ such that

$$\lim_{n\to\infty} \|\widehat{\eta}_n - (\pi_{a(n)} \times L)_2\|_{tv} = 0 \qquad (6.1)$$

Since the distributions $\pi_a$ concentrate as $a \to \infty$ to the set of $\nu$-essential
infima of the energy function $V$, the preceding convergence result implies
that the annealed Feynman-Kac-Metropolis model has the same concentration
properties. More precisely, any McKean interpretation of this annealed
model converges in probability to the set of $(\nu \times L)_2$-essential infima of the
energy function $(1 \otimes V)$ on $E$.

We decompose this problem into two parts. First we estimate the relaxation
times of the semigroups $\widehat{\Phi}_a^n$ associated with the one-step mappings
$\widehat{\Phi}_a$, $a \in \mathbb{R}_+$. The second step consists in estimating the oscillations
of the mappings

$$a \in \mathbb{R}_+ \longrightarrow (\pi_a \times L)_2 \in \mathcal{P}(E)$$

with respect to the total variation distance. These two regularity properties
are studied in Section 6.2.2. In Section 6.2.3, we prove that we have
the desired convergence result (6.1) for any “sublinear” increasing cooling
schedule $a$.



6.2.2 Regularity Properties 

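To clarify the presentation, it is convenient to introduce the nonnegative
and finite constants

$$b_m(L) = -\frac{1}{\log \beta(L^m)} \qquad\qquad c_L(\nu) = 1 - \log\big(\epsilon^3(L)\, h(\nu_L, \nu)/4\big)$$

with the convention $b_m(L) = 0$ when $\beta(L^m) = 0$ and for any $a \in \mathbb{R}_+$

$$\Delta(a) = (m + 1)\, \big(2 + \lfloor b_m(L)\, (c_L(\nu) + a\, \mathrm{osc}(V)) \rfloor\big)$$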







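Lemma 6.2.1 For any positive measure $\lambda$ on a measurable space $(E, \mathcal{E})$
and for any measurable function $U : E \to \mathbb{R}_+$ such that $\lambda(e^{-U}) > 0$, we set

$$\mu_U(dx) = \frac{1}{Z_U}\, e^{-U(x)}\, \lambda(dx) \quad\text{with}\quad Z_U = \lambda(e^{-U})$$

Then for any pair of nonnegative measurable functions $(U_1, U_2)$ such that
$\lambda(e^{-U_1}) > 0$ and $\lambda(e^{-U_2}) > 0$, we have that

$$\|\mu_{U_1} - \mu_{U_2}\|_{tv} \le \frac{1}{2}\, \mathrm{osc}(U_1 - U_2)$$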



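Proof:

We use the decomposition

$$\mu_{U_1}(dx) = \frac{Z_{(U_1+U_2)/2}}{Z_{U_1}}\; e^{-\frac{1}{2}(U_1 - U_2)(x)}\; \mu_{(U_1+U_2)/2}(dx)$$

and the fact that

$$\frac{Z_{(U_1+U_2)/2}}{Z_{U_1}} \ge \exp\Big(-\frac{1}{2} \sup_E\, (U_2 - U_1)\Big)$$

to check that for any $A \in \mathcal{E}$ we have

$$\mu_{U_1}(A) \ge \mu_{(U_1+U_2)/2}(A)\; \exp\Big(-\frac{1}{2}\, \mathrm{osc}(U_2 - U_1)\Big)$$

Since we have $\mathrm{osc}(U_2 - U_1) = \mathrm{osc}(U_1 - U_2)$, by symmetry arguments we
prove that for any $A \in \mathcal{E}$

$$\mu_{U_2}(A) \ge \mu_{(U_1+U_2)/2}(A)\; \exp\Big(-\frac{1}{2}\, \mathrm{osc}(U_2 - U_1)\Big)$$

Using (4.5) in Lemma 4.2.2, we conclude that

$$\|\mu_{U_1} - \mu_{U_2}\|_{tv} \le 1 - \exp\Big(-\frac{1}{2}\, \mathrm{osc}(U_2 - U_1)\Big) \le \frac{1}{2}\, \mathrm{osc}(U_1 - U_2)$$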
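The two bounds at the end of the proof are easy to test on a finite space. The sketch below is an illustration only: the reference weights `lam`, the random potentials and the helper names are hypothetical, and the total variation distance is taken as half the $\ell^1$ distance, which is the normalization under which the distance between two probability measures is at most 1.

```python
import math
import random


def gibbs_measure(lam, U):
    """mu_U(x) = exp(-U(x)) lam(x) / Z_U on a finite space."""
    w = [l * math.exp(-u) for l, u in zip(lam, U)]
    z = sum(w)
    return [x / z for x in w]


def total_variation(p, q):
    """Half the l1 distance, so that the distance is at most 1."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def osc(f):
    """Oscillation sup f - inf f of a function on a finite space."""
    return max(f) - min(f)


def lemma_bounds(lam, U1, U2):
    """Return (tv distance, 1 - exp(-osc/2), osc/2) for the pair (U1, U2)."""
    tv = total_variation(gibbs_measure(lam, U1), gibbs_measure(lam, U2))
    half_osc = 0.5 * osc([u1 - u2 for u1, u2 in zip(U1, U2)])
    return tv, 1.0 - math.exp(-half_osc), half_osc


random.seed(0)
results = []
for _ in range(200):
    size = random.randint(2, 8)
    lam = [random.uniform(0.1, 2.0) for _ in range(size)]   # positive weights
    U1 = [random.uniform(0.0, 3.0) for _ in range(size)]
    U2 = [random.uniform(0.0, 3.0) for _ in range(size)]
    results.append(lemma_bounds(lam, U1, U2))

worst = max(tv / half for tv, _, half in results)
print(f"largest observed ratio tv / (osc/2): {worst:.3f}")
```

The intermediate bound $1 - e^{-\mathrm{osc}/2}$ is kept separately because it is the sharper of the two and never exceeds 1.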



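Proposition 6.2.1 For any $a_1 \le a_2$, we have

$$\|(\pi_{a_1} \times L)_2 - (\pi_{a_2} \times L)_2\|_{tv} \le \frac{1}{2}\, (a_2 - a_1)\, \mathrm{osc}(V) \qquad (6.2)$$

For any $a \in \mathbb{R}_+$ and for any pair of distributions $(\mu, \eta)$ on $E$, we have

$$\|\widehat{\Phi}_a^{\Delta(a)}(\mu) - \widehat{\Phi}_a^{\Delta(a)}(\eta)\|_{tv} \le \frac{1}{e}\, \|\mu - \eta\|_{tv} \qquad (6.3)$$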







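Proof:

To prove the first assertion, we observe that for each $a \in \mathbb{R}_+$ the measure
$(\pi_a \times L)_2$ on $E = S^2$ can be rewritten in the Gibbs form

$$(\pi_a \times L)_2(dx) = \frac{1}{\lambda(e^{-aU})}\, e^{-aU(x)}\, \lambda(dx)$$

with the function $U : E \to \mathbb{R}_+$ and the measure $\lambda$ on $E$ defined by

$$U(y, y') = V(y') \quad\text{and}\quad \lambda = (\nu \times L)_2$$

Thus, the proof of the first assertion is a clear consequence of Lemma 6.2.1.
To prove the contraction estimate, we use Theorem 5.5.4 to first check that
for any $a \in \mathbb{R}_+$

$$\|\widehat{\Phi}_a^{n+m+1}(\mu) - \widehat{\Phi}_a^{n+m+1}(\eta)\|_{tv} \le 4\,\epsilon^{-3}(L)\, h(\nu_L, \pi_a)^{-1}\, \beta(L^n)\, \|\mu - \eta\|_{tv}$$

Since we also have that

$$h(\nu_L, \pi_a) \ge e^{-a\,\mathrm{osc}(V)}\, h(\nu_L, \nu) \quad\text{and}\quad \beta(L^{km}) \le \beta(L^m)^k$$

we conclude that

$$\|\widehat{\Phi}_a^{km+m+1}(\mu) - \widehat{\Phi}_a^{km+m+1}(\eta)\|_{tv} \le 4\,\epsilon^{-3}(L)\, h(\nu_L, \nu)^{-1}\, e^{a\,\mathrm{osc}(V)}\, \beta(L^m)^k\, \|\mu - \eta\|_{tv}$$

The case $\beta(L^m) = 0$ corresponds to the situation where $\widehat{\Phi}_a^n(\mu) = (\pi_a \times L)_2$
for any $n > m$ and $\mu \in \mathcal{P}(E)$. In this situation, we have $\Delta(a) = 2(m + 1)$
and the proof of (6.3) is trivial. If $\beta(L^m) > 0$, we observe that

$$\beta(L^m)^k\, [4\,\epsilon^{-3}(L)\, h(\nu_L, \nu)^{-1}]\, e^{a\,\mathrm{osc}(V)} \le \frac{1}{e}$$

$$\iff 1 + a\, \mathrm{osc}(V) \le k\, b_m^{-1}(L) + 1 - c_L(\nu)$$

$$\iff k \ge b_m(L)\, (c_L(\nu) + a\, \mathrm{osc}(V))$$

The end of the proof of the proposition is now clear. ■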



6.2.3 Asymptotic Behavior

We use the same notations as in Section 6.2.2. Let $a' : \mathbb{N} \to \mathbb{R}_+$ be a
nondecreasing function. We associate with $a'$ the time mesh

$$t(n + 1) = t(n) + \Delta'(n) \quad\text{with}\quad t(0) = 0, \quad \Delta'(n) = \Delta(a'(n))$$







and the piecewise constant cooling schedule $a : \mathbb{N} - \{0\} \to \mathbb{R}_+$ given for
any $p \ge 1$ by

$$a(p) = a'(n) \quad\text{for}\quad t(n) < p \le t(n + 1)$$
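As a rough illustration, the time mesh and the piecewise constant schedule can be computed directly from the constants of Section 6.2.2. Everything numerical below is an assumption made for the example: the values of `m`, `beta_Lm` (standing for $\beta(L^m)$, which must lie in $[0, 1)$), `eps_L`, `h`, `osc_V`, and the choice $a'(n) = (n+1)^{1/2}$. The function names are hypothetical as well.

```python
import bisect
import math


def b_m(beta_Lm):
    """b_m(L) = -1 / log beta(L^m), with the convention b_m(L) = 0 if beta(L^m) = 0."""
    return 0.0 if beta_Lm == 0.0 else -1.0 / math.log(beta_Lm)


def c_L(eps_L, h):
    """c_L(nu) = 1 - log(eps(L)^3 h(nu_L, nu) / 4)."""
    return 1.0 - math.log(eps_L ** 3 * h / 4.0)


def delta(a, m, beta_Lm, eps_L, h, osc_V):
    """Delta(a) = (m + 1) (2 + floor(b_m(L) (c_L(nu) + a osc(V))))."""
    return (m + 1) * (2 + math.floor(b_m(beta_Lm) * (c_L(eps_L, h) + a * osc_V)))


def time_mesh(a_prime, n_steps, **params):
    """t(0) = 0 and t(n+1) = t(n) + Delta(a'(n))."""
    t = [0]
    for n in range(n_steps):
        t.append(t[-1] + delta(a_prime(n), **params))
    return t


def schedule(p, t, a_prime):
    """a(p) = a'(n) for t(n) < p <= t(n+1); requires 1 <= p <= t[-1]."""
    n = bisect.bisect_left(t, p) - 1
    return a_prime(n)


# Hypothetical constants, for illustration only.
params = dict(m=2, beta_Lm=0.5, eps_L=0.5, h=1.0, osc_V=1.0)
alpha = 0.5


def a_prime(n):
    return (n + 1) ** alpha


t = time_mesh(a_prime, 5, **params)
print("time mesh:", t)
print("a(1), a(t(1)), a(t(1)+1):",
      schedule(1, t, a_prime), schedule(t[1], t, a_prime), schedule(t[1] + 1, t, a_prime))
```

With these values the first block has length $\Delta(a'(0)) = 27$, so the inverse temperature stays at $a'(0) = 1$ for the first 27 steps before moving to $a'(1)$.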



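By construction, the annealed Feynman-Kac flow $\widehat{\eta}_p$ associated with the
preceding cooling schedule is defined for any $n \in \mathbb{N}$ by

$$\widehat{\eta}_p = \widehat{\Phi}_{a'(n)}(\widehat{\eta}_{p-1}) \quad\text{for each}\quad t(n) < p \le t(n + 1)$$

In other words, $\widehat{\eta}_p$ is the Feynman-Kac model with a constant inverse
temperature parameter $a'(n)$ between the dates $t(n)$ and $t(n + 1)$. That is, we
have for each $0 \le p \le t(n + 1) - t(n)$

$$\widehat{\eta}_{t(n)+p}(f) = \frac{\mathbb{E}_{\widehat{\eta}_{t(n)}}\Big(f(X_p) \prod_{q=1}^{p} G_{a'(n)}(X_q)\Big)}{\mathbb{E}_{\widehat{\eta}_{t(n)}}\Big(\prod_{q=1}^{p} G_{a'(n)}(X_q)\Big)}$$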



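Theorem 6.2.1 For any nondecreasing cooling schedule $a'$ we have

$$\lim_{n\to\infty} (a'(n + 1) - a'(n)) = 0 \;\Longrightarrow\; \lim_{n\to\infty} \|\widehat{\eta}_{t(n)} - (\pi_{a'(n)} \times L)_2\|_{tv} = 0$$

In particular, if we choose for some $a \in (0, 1)$ $a'(n) = (n + 1)^a$, then we
have $t(n) = O(n^{1+a})$ and for any $n \in \mathbb{N}$

$$(n + 1)^{1-a}\, \|\widehat{\eta}_{t(n+1)} - (\pi_{a'(n+1)} \times L)_2\|_{tv} \le \frac{1}{e} + a\Big(e + \frac{1}{2}\Big)\, \mathrm{osc}(V)$$

For the proof of this theorem, we need the following lemma.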

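Lemma 6.2.2 (Toeplitz-Kronecker) For any sequence of strictly positive
numbers $a_n$ and for any converging sequence of numbers $x_n$, we have

$$\sum_{n \ge 1} a_n = \infty \;\text{ and }\; \lim_{n\to\infty} x_n = x \;\Longrightarrow\; \lim_{n\to\infty} \frac{\sum_{p=1}^n a_p\, x_p}{\sum_{p=1}^n a_p} = x$$

Whenever $a_n$ is strictly increasing, we have

$$\lim_{n\to\infty} a_n = \infty \;\text{ and }\; \sum_{n \ge 1} x_n < \infty \;\Longrightarrow\; \lim_{n\to\infty} \frac{1}{a_n} \sum_{p=1}^n a_p\, x_p = 0$$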



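Proof:

For any $\epsilon > 0$, we first choose an integer $n(\epsilon) \ge 1$ such that $|x_n - x| \le \epsilon$ for
any $n \ge n(\epsilon)$. Since $\sum_{p=1}^n a_p$ converges to infinity as $n \to \infty$, we can also
find an integer $n'(\epsilon) \ge 1$ such that

$$\sum_{p=1}^{n(\epsilon)} a_p\, |x_p - x| \le \epsilon \sum_{p=1}^{n'(\epsilon)} a_p$$

Now we use the estimate

$$\Big| \frac{\sum_{p=1}^n a_p\, x_p}{\sum_{p=1}^n a_p} - x \Big| \le \frac{1}{\sum_{p=1}^n a_p} \sum_{p=1}^n a_p\, |x_p - x|$$

For any $n \ge n(\epsilon) \vee n'(\epsilon)$, we have that

$$\sum_{p=1}^n a_p\, |x_p - x| = \sum_{p=1}^{n(\epsilon)} a_p\, |x_p - x| + \sum_{p=n(\epsilon)+1}^{n} a_p\, |x_p - x| \le \epsilon \sum_{p=1}^{n'(\epsilon)} a_p + \epsilon \sum_{p=n(\epsilon)+1}^{n} a_p \le 2\epsilon \sum_{p=1}^{n} a_p$$

This yields that for any $n \ge n(\epsilon) \vee n'(\epsilon)$

$$\Big| \frac{\sum_{p=1}^n a_p\, x_p}{\sum_{p=1}^n a_p} - x \Big| \le 2\epsilon$$

and the proof of the first assertion is completed. To prove the second one,
we put

$$X_n = \sum_{p=1}^n x_p \quad\text{and}\quad a_n = \sum_{p=1}^n \Delta a_p$$

and, for any sequence of numbers $u_n$, $\Delta(u)_n = u_n - u_{n-1}$, with the
conventions $a_0 = 0$ and $X_0 = 0$. From previous calculations, we get

$$a_n\, x_n = a_n\, (X_n - X_{n-1}) = (a_n X_n - a_{n-1} X_{n-1}) - X_{n-1}\, (a_n - a_{n-1}) = \Delta(aX)_n - X_{n-1}\, \Delta a_n$$

Therefore we have that $\sum_{p=1}^n a_p\, x_p = a_n X_n - \sum_{p=1}^n X_{p-1}\, \Delta a_p$, from
which we conclude that

$$\frac{1}{a_n} \sum_{p=1}^n a_p\, x_p = X_n - \frac{1}{a_n} \sum_{p=1}^n X_{p-1}\, \Delta a_p$$

Since $a_n$ is strictly positive and strictly increasing, we have
$a_n = \sum_{p=1}^n \Delta a_p$ with $\Delta a_p > 0$. We deduce from the first assertion that

$$\lim_{n\to\infty} \frac{1}{a_n} \sum_{p=1}^n X_{p-1}\, \Delta a_p = \lim_{n\to\infty} X_n$$

Using the preceding decomposition, this implies that

$$\lim_{n\to\infty} \frac{1}{a_n} \sum_{p=1}^n a_p\, x_p = 0$$

This ends the proof of the lemma. ■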
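Both assertions of the lemma can be observed numerically on explicit sequences. The sequences below are hypothetical choices made for illustration: weights $a_p = p$, a convergent sequence $x_p = 1 + 1/p$ for the Toeplitz part, and the summable alternating sequence $x_p = (-1)^p / p$ for the Kronecker part.

```python
def toeplitz_average(a, x):
    """Weighted average sum(a_p x_p) / sum(a_p)."""
    return sum(ap * xp for ap, xp in zip(a, x)) / sum(a)


def kronecker_average(a, x):
    """Normalized sum (1 / a_n) sum(a_p x_p), for an increasing sequence a."""
    return sum(ap * xp for ap, xp in zip(a, x)) / a[-1]


n = 2000
a = [float(p) for p in range(1, n + 1)]                 # strictly increasing, sum diverges
x_conv = [1.0 + 1.0 / p for p in range(1, n + 1)]       # x_p -> 1
x_summable = [(-1) ** p / p for p in range(1, n + 1)]   # alternating series, converges

first = toeplitz_average(a, x_conv)
second = kronecker_average(a, x_summable)
print(f"weighted average: {first:.6f} (limit of x_p is 1)")
print(f"normalized sum:   {second:.6f} (expected to tend to 0)")
```

For these particular sequences the weighted average equals $1 + 2/(n+1)$ exactly, so the $O(1/n)$ rate of convergence is visible directly.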











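Proof of Theorem 6.2.1:

The proof of the first assertion results from the contraction and oscillation
properties presented in Proposition 6.2.1. We use the decomposition

$$\widehat{\eta}_{t(n+1)} - (\pi_{a'(n+1)} \times L)_2 = \big[\widehat{\Phi}_{a'(n)}^{\Delta'(n)}(\widehat{\eta}_{t(n)}) - (\pi_{a'(n)} \times L)_2\big] + \big[(\pi_{a'(n)} \times L)_2 - (\pi_{a'(n+1)} \times L)_2\big]$$

Since we have

$$(\pi_{a'(n)} \times L)_2 = \widehat{\Phi}_{a'(n)}^{\Delta'(n)}\big((\pi_{a'(n)} \times L)_2\big)$$

then by (6.3) we prove that

$$\|\widehat{\Phi}_{a'(n)}^{\Delta'(n)}(\widehat{\eta}_{t(n)}) - (\pi_{a'(n)} \times L)_2\|_{tv} \le \frac{1}{e}\, \|\widehat{\eta}_{t(n)} - (\pi_{a'(n)} \times L)_2\|_{tv}$$

On the other hand, using (6.2) we find that

$$\|(\pi_{a'(n)} \times L)_2 - (\pi_{a'(n+1)} \times L)_2\|_{tv} \le \frac{1}{2}\, (a'(n + 1) - a'(n))\, \mathrm{osc}(V)$$

If we put

$$I_n = \|\widehat{\eta}_{t(n)} - (\pi_{a'(n)} \times L)_2\|_{tv}$$

from the preceding estimates, we find that

$$I_{n+1} \le \frac{1}{e}\, I_n + \frac{1}{2}\, (a'(n + 1) - a'(n))\, \mathrm{osc}(V)$$

By simple calculations, we conclude that

$$I_{n+1} \le \frac{1}{e^{n+1}} + \frac{1}{2e^n}\, \mathrm{osc}(V) \sum_{p=0}^{n} e^p\, (a'(p + 1) - a'(p))$$

By the Toeplitz lemma, we have

$$\lim_{n\to\infty} (a'(n + 1) - a'(n)) = 0 \;\Longrightarrow\; \lim_{n\to\infty} \frac{1}{\sum_{q=0}^n e^q} \sum_{p=0}^{n} e^p\, (a'(p + 1) - a'(p)) = 0$$

Since $e^{-n} \sum_{q=0}^n e^q = (e - e^{-n})/(e - 1)$, we readily get

$$\lim_{n\to\infty} (a'(n + 1) - a'(n)) = 0 \;\Longrightarrow\; \lim_{n\to\infty} I_n = 0$$

To prove the second assertion, we recall that $x^a - y^a \le a\, y^{a-1}\, (x - y)$ for
any $x, y > 0$, and from this inequality we get

$$\sum_{p=1}^{n} e^p\, \big((p + 1)^a - p^a\big) \le a \sum_{p=1}^{n} \frac{e^p}{p^{1-a}} = ae \sum_{p=1}^{n} \frac{e^{p-1}}{p^{1-a}}$$

Next, we observe that for any $p \ge 2$

$$\frac{e^{p-2}}{(p - 1)^{1-a}} = \frac{1}{e}\, \frac{e^{p-1}}{p^{1-a}}\, \Big(\frac{p}{p - 1}\Big)^{1-a} \le \frac{2^{1-a}}{e}\, \frac{e^{p-1}}{p^{1-a}} \le \frac{2}{e}\, \frac{e^{p-1}}{p^{1-a}}$$

Therefore we have

$$\sum_{p=2}^{n} \frac{e^{p-1}}{p^{1-a}} \le (1 - 2/e)^{-1} \sum_{p=2}^{n} \Big(\frac{e^{p-1}}{p^{1-a}} - \frac{e^{p-2}}{(p - 1)^{1-a}}\Big) \le \frac{2e^n}{n^{1-a}}$$

This yields that

$$I_{n+1} \le \frac{1}{e^{n+1}} + \frac{1}{2e^{n+1}}\, \mathrm{osc}(V) \sum_{p=1}^{n+1} e^p\, \big((p + 1)^a - p^a\big) \le \frac{1}{e^{n+1}} + \frac{ae}{2e^{n+1}}\, \mathrm{osc}(V) + \frac{ae\, \mathrm{osc}(V)}{(n + 1)^{1-a}}$$

Recalling that $n e^{-n} \le e^{-1}$, for any $n \ge 1$, we conclude that

$$(n + 1)^{1-a}\, I_{n+1} \le 1/e + a\, (e + 1/2)\, \mathrm{osc}(V)$$

To prove that $t(n) = O(n^{1+a})$ we use the fact that

$$\Delta'(p) \le (m + 1)\, \big(3 + b_m(L)\, (c_L(\nu) + (p + 1)^a\, \mathrm{osc}(V))\big)$$

This implies that

$$t(n) = \sum_{p=0}^{n-1} \Delta'(p) \le (m + 1)\, \big([3 + b_m(L)\, c_L(\nu)]\, n + b_m(L)\, \mathrm{osc}(V)\, n^{1+a}\big) \le (m + 1)\, n^{1+a}\, \big(3 + b_m(L)\, (c_L(\nu) + \mathrm{osc}(V))\big)$$

This establishes the theorem. ■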
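The rate in the second assertion can be sanity-checked by iterating the worst case of the recursion, that is, taking equality in $I_{n+1} \le I_n/e + \frac{1}{2}(a'(n+1) - a'(n))\,\mathrm{osc}(V)$ with $I_0 = 1$. This is only a sketch: the exponent is called `alpha` in the code to avoid a clash with other names, and the values of `alpha` and `osc_V` are arbitrary illustrative choices.

```python
import math


def worst_case_errors(alpha, osc_V, n_steps, i0=1.0):
    """Iterate I_{n+1} = I_n / e + 0.5 * osc(V) * (a'(n+1) - a'(n)) with a'(n) = (n+1)^alpha."""
    errors = [i0]
    for n in range(n_steps):
        increment = (n + 2) ** alpha - (n + 1) ** alpha
        errors.append(errors[-1] / math.e + 0.5 * osc_V * increment)
    return errors


def rate_bound(alpha, osc_V):
    """Right-hand side 1/e + alpha (e + 1/2) osc(V) of the stated estimate."""
    return 1.0 / math.e + alpha * (math.e + 0.5) * osc_V


cases = [(0.25, 1.0), (0.5, 1.0), (0.9, 5.0)]
n_steps = 500
for alpha, osc_V in cases:
    errors = worst_case_errors(alpha, osc_V, n_steps)
    scaled = max((n + 1) ** (1 - alpha) * errors[n + 1] for n in range(n_steps))
    print(f"alpha={alpha}, osc(V)={osc_V}: max scaled error {scaled:.3f} "
          f"<= bound {rate_bound(alpha, osc_V):.3f}")
```

In these runs the scaled error stays well below the bound, which suggests the constant in the theorem is not meant to be sharp.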



6.3 Feynman-Kac Trapping Models

6.3.1 Description of the Model

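Let $M$ be a Markov kernel on some measurable space $(E, \mathcal{E})$. Also, let $V$
be a nonnegative and measurable function on $E$ with bounded oscillations
$\mathrm{osc}(V) < \infty$ and let $a : \mathbb{N} \to \mathbb{R}_+$ be a nondecreasing function. We associate
with the triplet $(a, V, M)$ the annealed Feynman-Kac updated model

$$\widehat{\eta}_n = \widehat{\Phi}_{a(n)}(\widehat{\eta}_{n-1})$$

where $\widehat{\eta}_0$ is an arbitrary distribution on $E$ and $\widehat{\Phi}_{a'}$, $a' \in \mathbb{R}_+$, is the
collection of mappings

$$\widehat{\Phi}_{a'} : \mathcal{P}(E) \to \mathcal{P}(E)$$

defined for any $(\mu, f) \in (\mathcal{P}(E) \times \mathcal{B}_b(E))$ by

$$\widehat{\Phi}_{a'}(\mu)(f) = \frac{\mu M(e^{-a'V} f)}{\mu M(e^{-a'V})}$$

Notice that $\widehat{\eta}_n$ is alternatively defined by the Feynman-Kac formulae

$$\widehat{\eta}_n(f) = \widehat{\gamma}_n(f)/\widehat{\gamma}_n(1) \quad\text{and}\quad \widehat{\gamma}_n(f) = \mathbb{E}_{\widehat{\eta}_0}\Big(f(X_n)\, e^{-\sum_{p=1}^n a(p)\, V(X_p)}\Big)$$

where $\mathbb{E}_{\widehat{\eta}_0}(\cdot)$ represents the expectation with respect to the law of a
Markov chain $X_n$ with initial distribution $\widehat{\eta}_0$ and Markov transitions $M$.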

This model arises in a natural way in trapping analysis. We refer the 
reader to Section 2.5.1 for a detailed discussion on this subject. In this con- 
text, the Markov kernel M represents the transitions of a particle evolving 
in a medium E. The potential function V represents the energy landscape 
and the strength of the obstacles. The exponential term $e^{-a(n)V(x)}$
represents the probability at which the particle at site $x$ is not absorbed. In
this interpretation, the cooling schedule represents the temperature of the
medium. The more the temperature decreases, the more stringent the
obstacles become.



6.3.2 Regularity Properties 

This section is concerned with the regularity properties of the semigroups
$\widehat{\Phi}_a^n$, $a \in \mathbb{R}_+$. This question is clearly connected with the study of the
contraction properties of updated Feynman-Kac semigroups presented in
Section 4.4. In our further development, we assume that the Markov kernel
$M$ satisfies condition $(M)_m$ for some integer parameter $m \ge 1$ and some
$\epsilon(M) > 0$. That is, we have for any pair $(x, x') \in E^2$

$$M^m(x, \cdot) \ge \epsilon(M)\, M^m(x', \cdot)$$

To clarify the presentation, we introduce the nonnegative constants

$$c(M) = 1 - \log(\epsilon(M)/2)$$

$$\delta(m) = (m - 1)\, \mathrm{osc}(V)$$

and for any $a \in \mathbb{R}_+$

$$\Delta(a) = m\, \big(1 + \lfloor e^{a\delta(m)}\, (a\delta(m) + c(M))/\epsilon^2(M) \rfloor\big)$$






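Notice that for $m = 1$ we have $\delta(m) = 0$ and

$$\Delta(a) = \Delta(0) = \big(1 + \lfloor c(M)/\epsilon^2(M) \rfloor\big)$$

We observe that for any fixed $a \in \mathbb{R}_+$ the semigroup $\widehat{\Phi}_a^n$ defined by the
inductive formulae

$$\widehat{\Phi}_a^n = \widehat{\Phi}_a \circ \widehat{\Phi}_a^{n-1}$$

is the updated Feynman-Kac semigroup associated with the pair of
potential/kernel $(G_a, M)$. Since $V$ has finite oscillations we see that the
time-homogeneous potential function $G_a = e^{-aV}$ satisfies condition $(G)$
with $\epsilon(G) = e^{-a\,\mathrm{osc}(V)}$; that is, we have for any $(x, x') \in E^2$

$$G_a(x) \ge e^{-a\,\mathrm{osc}(V)}\, G_a(x')$$

Using Proposition 4.4.3, we get the contraction inequality

$$\|\widehat{\Phi}_a^{nm}(\mu) - \widehat{\Phi}_a^{nm}(\eta)\|_{tv} \le 2\,\epsilon^{-1}(M)\, e^{a\delta(m)}\, \big(1 - \epsilon^2(M)\, e^{-a\delta(m)}\big)^n\, \|\mu - \eta\|_{tv}$$

for any pair $(\mu, \eta) \in \mathcal{P}(E)^2$ and for any $n \in \mathbb{N}$. By the Banach fixed point
theorem we conclude that each mapping $\widehat{\Phi}_a$ has a unique fixed point

$$\mu_a = \widehat{\Phi}_a(\mu_a) \in \mathcal{P}(E)$$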
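On a finite state space the fixed point can be approximated simply by iterating the mapping, which is how the contraction argument works in practice. The sketch below assumes a hypothetical 3-state kernel `M` with positive entries (so condition $(M)_m$ holds with $m = 1$), a hypothetical potential `V` and inverse temperature `a`; the function names are illustrative and the stopping rule is an arbitrary tolerance on the total variation distance between successive iterates.

```python
import math


def updated_map(mu, M, V, a):
    """One step mu -> mu M(exp(-a V) .) / mu M(exp(-a V)) on a finite space."""
    n = len(mu)
    predicted = [sum(mu[x] * M[x][y] for x in range(n)) for y in range(n)]
    weighted = [predicted[y] * math.exp(-a * V[y]) for y in range(n)]
    total = sum(weighted)
    return [w / total for w in weighted]


def fixed_point(M, V, a, tol=1e-13, max_iter=10_000):
    """Iterate the updated mapping from the uniform distribution until it stabilizes."""
    mu = [1.0 / len(V)] * len(V)
    for _ in range(max_iter):
        nxt = updated_map(mu, M, V, a)
        if 0.5 * sum(abs(p - q) for p, q in zip(nxt, mu)) < tol:
            return nxt
        mu = nxt
    raise RuntimeError("fixed point iteration did not converge")


# Hypothetical data: positive kernel, energy increasing with the state index.
M = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]
V = [0.0, 1.0, 2.0]
a = 1.5

mu_a = fixed_point(M, V, a)
image = updated_map(mu_a, M, V, a)
residual = max(abs(p - q) for p, q in zip(image, mu_a))
print("mu_a =", [round(p, 6) for p in mu_a], "residual =", f"{residual:.1e}")
```

Running the iteration for increasing values of `a` shows the mass of the fixed point moving toward the low-energy state, which is the concentration phenomenon studied later in the section.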

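Proposition 6.3.1 For any $a \in \mathbb{R}_+$ and for any pair $(\mu, \eta) \in \mathcal{P}(E)^2$, we
have

$$\|\widehat{\Phi}_a^{\Delta(a)}(\mu) - \widehat{\Phi}_a^{\Delta(a)}(\eta)\|_{tv} \le \frac{1}{e}\, \|\mu - \eta\|_{tv} \qquad (6.4)$$

In addition, for any $a_1 \le a_2$, we have the oscillation estimate

$$\|\mu_{a_1} - \mu_{a_2}\|_{tv} \le \mathrm{osc}(V)\, \Delta(a_1)\, (a_2 - a_1)$$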


Proof: 

In view of the preceding contraction estimate, we have 

< 2e-\M) exp (o<J(m) - ||a: - »?l|tv 

Since 

t^{M) A(a) > m e“‘'(”*>(o<J(m) + c(M)) 

a<5(m) - €^{M)^^ = iog(£(M)/2) - 1 

m 



we get 




200 6. Annealing Properties 



from which the proof of (6.4) is clear. To prove the second assertion, we 
use the decomposition 

Ato, - Ata, = $a/“‘^(A^a,) ~ ~ 

By (6.4), we find that 

llA^a, - A^aJItv < ~ A^aJItv + ~ 

from which we conclude that 

ll/Xa, - A^aJItv < ^ ll$^/‘“>(AiaJ - 



It is convenient to recall at this stage that for any fixed parameter 02 € R+ 
and for any a € R+, n G N, / G Bb{E), we have 






> _ K^(f(Xn) ’'I*'') 



We also see that each distribution is the n-time marginal of the 

Gibbs-Boltzmann measure on defined by 






e~oV„(x) 

^(n)(g-aV„) 






with the reference distribution A^^ and the potential Vn firom E" into R+ 
defined for any x = (xi, . . . , Xn) by 



Vn(3/l, • . • ) Xn) 
A(”>(d(xx,...,Xn)) 






P=1 



(AiajAf)(dxi) M(ii,dx2)... M(x„_i,dx„) 



By Lemma6.2.1, we have for any a\ < Q 2 

llAii*;) - A^il^lltv < ^ («2 - ai) oec(K) < | (02 - oi) osc(V^) 

We conclude that 

< ^(a2-Ol)06c(K) 

We end the proof of the proposition using the boimd e < 2(e — 1). ■ 







6.3.3 Asymptotic Behavior 

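We use the same notations as in Section 6.3.2. To define the annealed
model, we associate with a given nondecreasing function $a' : \mathbb{N} \to \mathbb{R}_+$ the
time mesh

$$t(n + 1) = t(n) + \Delta(a'(n)) \quad\text{with}\quad t(0) = 0, \quad \Delta'(n) = \Delta(a'(n))$$

and the piecewise constant cooling schedule $a : \mathbb{N} - \{0\} \to \mathbb{R}_+$ given for
any $p \ge 1$ by

$$a(p) = a'(n) \quad\text{for}\quad t(n) < p \le t(n + 1)$$

The annealed Feynman-Kac flow $\widehat{\eta}_p$ associated with this cooling schedule
is defined for any $n \in \mathbb{N}$ and $t(n) < p \le t(n + 1)$ by

$$\widehat{\eta}_p = \widehat{\Phi}_{a'(n)}(\widehat{\eta}_{p-1})$$

We emphasize that $\widehat{\eta}_p$ is the Feynman-Kac model with a constant inverse
temperature parameter $a'(n)$ between the dates $t(n)$ and $t(n + 1)$. For each
$0 \le p \le \Delta'(n)$, we have that

$$\widehat{\eta}_{t(n)+p}(f) = \frac{\mathbb{E}_{\widehat{\eta}_{t(n)}}\Big(f(X_p)\, e^{-a'(n) \sum_{q=1}^p V(X_q)}\Big)}{\mathbb{E}_{\widehat{\eta}_{t(n)}}\Big(e^{-a'(n) \sum_{q=1}^p V(X_q)}\Big)}$$

To connect the uniform contraction estimates with the oscillations of the
fixed point measures presented in Proposition 6.3.1, we introduce the
decomposition

$$\widehat{\eta}_{t(n+1)} - \mu_{a'(n+1)} = \big[\widehat{\Phi}_{a'(n)}^{\Delta'(n)}(\widehat{\eta}_{t(n)}) - \widehat{\Phi}_{a'(n)}^{\Delta'(n)}(\mu_{a'(n)})\big] + \big[\mu_{a'(n)} - \mu_{a'(n+1)}\big]$$

Using Proposition 6.3.1, we find that

$$\|\widehat{\eta}_{t(n+1)} - \mu_{a'(n+1)}\|_{tv} \le \frac{1}{e}\, \|\widehat{\eta}_{t(n)} - \mu_{a'(n)}\|_{tv} + \mathrm{osc}(V)\, \Delta'(n)\, (a'(n + 1) - a'(n))$$

and thus it appears that

$$\|\widehat{\eta}_{t(n+1)} - \mu_{a'(n+1)}\|_{tv} \le \frac{1}{e^{n+1}} + \frac{1}{e^n}\, \mathrm{osc}(V) \sum_{p=0}^{n} e^p\, \Delta'(p)\, (a'(p + 1) - a'(p)) \qquad (6.5)$$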



We are now in a position to state the main result of this section. 







Theorem 6.3.1 Suppose condition $(M)_m$ is satisfied for some $m \ge 1$.
Then we have

$$\lim_{n\to\infty} \|\widehat{\eta}_{t(n)} - \mu_{a'(n)}\|_{tv} = 0$$

for any increasing cooling schedule $a'$ such that

$$\lim_{n\to\infty} e^{a'(n)\delta(m)}\, (1 + a'(n)\delta(m))\, [a'(n + 1) - a'(n)] = 0$$

We have the two distinguished cases

• If $m = 1$, then we have $t(n) = O(n)$ and we can choose for any
$a \in (0, 1)$

$$a'(n) = (n + 1)^a$$

In this case, we have for some $c(a) < \infty$ and any $n \ge 1$

$$\|\widehat{\eta}_{t(n)} - \mu_{a'(n)}\|_{tv} \le c(a)/n^{1-a} \qquad (6.6)$$

• If $m > 1$, then we can choose

$$a'(n) = a'(0) \log(n + e), \quad\text{with}\quad b = \delta(m)\, a'(0) < 1$$

In this case we have $t(n) = O(n^{1+b} \log n)$ and for some $c(b) < \infty$ and
any $n > 1$

$$\|\widehat{\eta}_{t(n)} - \mu_{a'(n)}\|_{tv} \le c(b)\, \frac{\log n}{n^{1-b}}$$

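Proof:

If we put $I_n = \|\widehat{\eta}_{t(n)} - \mu_{a'(n)}\|_{tv}$, then by (6.5) we have

$$I_{n+1} \le \frac{1}{e^{n+1}} + \frac{1}{e^n}\, \mathrm{osc}(V) \sum_{p=0}^{n} e^p\, \Delta'(p)\, (a'(p + 1) - a'(p))$$

By the definition of $\Delta'(p)$, we have

$$\Delta'(p) \le m\, \big(2 + e^{\delta(m) a'(p)}\, [a'(p)\delta(m) + c(M)]/\epsilon^2(M)\big) \le m\, \big(2 + c'(M)\, e^{\delta(m) a'(p)}\, [1 + a'(p)\delta(m)]\big)$$

with $c'(M) = (1 \vee c(M))/\epsilon^2(M)$. This readily yields that

$$\Delta'(p) \le 3m\, c'(M)\, e^{\delta(m) a'(p)}\, (1 + a'(p)\delta(m))$$

and therefore

$$I_{n+1} \le \frac{1}{e^{n+1}} + \frac{c_m(V, M)}{e^n} \sum_{p=0}^{n} e^p\, e^{\delta(m) a'(p)}\, (1 + a'(p)\delta(m))\, (a'(p + 1) - a'(p))$$

with $c_m(V, M) = 3m\, c'(M)\, \mathrm{osc}(V)$. We use the Toeplitz lemma as in the
proof of Theorem 6.2.1 to prove the first assertion of the present theorem.
Next we examine the two cases $m = 1$ and $m > 1$. When $m = 1$, we have
$\delta(m) = 0$ and we find that

$$I_{n+1} \le \frac{1}{e^{n+1}} + \frac{c_1(V, M)}{e^n} \sum_{p=0}^{n} e^p\, (a'(p + 1) - a'(p))$$

If we take $a'(n) = (n + 1)^a$ for some $a \in (0, 1)$, then we argue as in the
proof of Theorem 6.2.1 and we prove that

$$I_{n+1} \le \frac{1}{e^{n+1}} + c_1(V, M)\, ae\, \Big(\frac{1}{e^{n+1}} + \frac{2}{(n + 1)^{1-a}}\Big)$$

from which we conclude that

$$(n + 1)^{1-a}\, I_{n+1} \le 1 + 3ae\, c_1(V, M)$$

Also notice that in this situation we have $\Delta'(p) \le 3c'(M)$ and therefore

$$t(n) = \sum_{p=0}^{n-1} \Delta'(p) \le 3c'(M)\, n$$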

This completes the proof of the second assertion. Next we examine the
situation where $m > 1$. We use the rather crude estimate

/„+! 

< 1 + c^{V, M) (1 + a'(n)5(m)] (a'(p + 1) - a'(p)) 

If we choose a'(n) = a'(0) log (n + e) with b = S(m)a'(0) < 1, then we find 
that 

/„+! 

<l + (l + 6)c„,(V,M) log(n + e) Ep=o (P + e)' log(l + ^) 
Recalling that log (1 + x) < x, for all x € (0, oo) we prove that 

, pP+i 

/n+i <! + (! + b)c„,(v, M) log (« + «) E 

On the other hand, from the estimates given in the proof of Theorem 6.2.1,
we have 








Thus we find that 

/n+i < l + (l + 6)c„.(V,Af) log(n + e)^ ^ - ^, 3 ^ 
We conclude easily that 

We end the proof by noting that

$$t(n) = \sum_{p=0}^{n-1} \Delta'(p) \le 3m\, c'(M)\, (1 + b) \log(n + e) \sum_{p=0}^{n-1} (p + e)^b \le 3m\, c'(M)\, (1 + b) \log(n + e)\, (n + e)^{1+b}$$



6.3.4 Large-Deviation Analysis

This section is concerned with the concentration properties of the fixed
point distributions $\mu_a$ as $a$ tends to infinity. We use large-deviation
arguments, and it is convenient to reduce the analysis to Polish state-space
models. More precisely, we further assume that $E$ is a separable topological
space whose topology is generated by a metric that is supposed to be
complete. We also assume that $V$ is a continuous and bounded potential
function on $E$.

The interplay between $\mu_a$ and the quantities $(a, M, V)$ is described by
the fixed point formula

$$\mu_a(f) = \mu_a(Q_a(f))/\mu_a(Q_a(1)) \quad\text{with}\quad Q_a(f) = M(e^{-aV} f)$$

Under the uniform mixing condition $(M)_m$, we recall that the Markov kernel
$M$ has a unique invariant measure $\nu = \nu M \in \mathcal{P}(E)$, and the sequence of
occupation measures $L_n = \frac{1}{n}\sum_{p=1}^n \delta_{X_p}$ of the chain $X_n$ satisfies,
as $n \to \infty$, a large-deviation principle with good rate function

$$I(\mu) = \inf_K \int \mu(dx)\, \mathrm{Ent}(K(x, \cdot) \mid M(x, \cdot)) \qquad (6.7)$$

where the infimum is taken over all Markov kernels $K$ with invariant measure $\mu$.

In the most naive view, we could think that the Feynman-Kac simulated
annealing model converges in probability to the $\nu$-essential infimum $V_\nu$ of
the potential $V$ defined by

$$V_\nu = \sup\{v \in \mathbb{R}_+ \ ;\ V \ge v \quad \nu\text{-a.e.}\}$$







This intuitive idea appears to be true for regular Markov transitions M with 
a diagonal term M (x, x) > 0, but it is false in more general situations. 

To better introduce our strategy to study the concentration properties
of $\mu_\alpha$, we need a more physical interpretation of the Feynman-Kac models.
If we interpret the potential $V$ as the absorption rate for a Markov particle
with transition $M$ evolving in a medium with obstacles, the normalizing
constant

$$\eta_0(Q_\alpha^n(1)) = \mathbb{E}_{\eta_0}\left(e^{-\alpha \sum_{p=1}^n V(X_p)}\right)$$

represents the probability that a Markov particle starting with distribution
$\eta_0$ performs a long crossing of length $n$ without being absorbed. For a
more precise description of this interpretation, we refer the reader to
Section 2.5.1. The cost attached to performing long crossings is measured in
terms of the logarithmic Lyapunov exponents $\Lambda(-\alpha V)$ of the semigroup
$Q_\alpha$ on the Banach space $\mathcal{B}_b(E)$ defined by the formulae

$$\Lambda(-\alpha V) = \lim_{n\to\infty} \frac{1}{n} \log \|Q_\alpha^n(1)\| = \lim_{n\to\infty} \frac{1}{n} \log \sup_{x} \mathbb{E}_x\left(e^{-\alpha \sum_{p=1}^n V(X_p)}\right)$$

The next lemma shows that these Lyapunov exponents coincide with
the exponential moments of the fixed point measures $\mu_\alpha$. It also brings the
large-deviation rate function $I$ into the concentration properties of $\mu_\alpha$.
Informally it shows that, for large $\alpha$,

$$\frac{1}{\alpha} \log \mu_\alpha(e^{\alpha V}) \simeq V_I$$

where $V_I$ is the value of the variational problem

$$V_I = \inf\{\mu(V) \ ;\ \mu \in \mathcal{P}(E) \ \text{s.t.}\ I(\mu) < \infty\} \qquad (6.8)$$

Loosely speaking, the concentration properties of the limiting measures $\mu_\alpha$
as $\alpha$ tends to infinity are related to a competition in $\mathcal{P}(E)$ between the
mean potential $\mu(V)$ and the $I$-entropy $I(\mu)$. Recall that $I(\mu) < \infty$ iff we
can find a kernel $K$ such that $\mu = \mu K$ and $K(x, \cdot) \ll M(x, \cdot)$.

The next lemma also shows that the concentration of $\mu_\alpha$ is related to a
variational problem in which the competition with the entropy $I$ becomes
less and less severe as $\alpha$ tends to infinity.

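Lemma 6.3.1 For any $\alpha \in \mathbb{R}_+$, we have the formulae

$$-\frac{\Lambda(-\alpha V)}{\alpha} = \frac{1}{\alpha} \log \mu_\alpha(e^{\alpha V}) = \inf_{\eta \in \mathcal{P}(E)} \left(\eta(V) + \frac{1}{\alpha} I(\eta)\right) \xrightarrow[\alpha \to \infty]{} V_I \ge V_\nu$$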

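Proof:

If we take $f = Q_\alpha^n(1)$ in the fixed point formula, we readily find the recursive
equation

$$\mu_\alpha(Q_\alpha^{n+1}(1)) = \mu_\alpha(Q_\alpha^n(1))\ \mu_\alpha(Q_\alpha(1))$$

Thus we have for each $n \ge 0$

$$\mu_\alpha(Q_\alpha^n(1)) = (\mu_\alpha(Q_\alpha(1)))^n = \mathbb{E}_{\mu_\alpha}\left(e^{-\alpha \sum_{p=1}^n V(X_p)}\right) \qquad (6.9)$$

Now if we take $f = e^{\alpha V}$ in the fixed point equation, we get

$$\mu_\alpha(Q_\alpha(1)) = 1/\mu_\alpha(e^{\alpha V}) \qquad (6.10)$$

Recalling that under condition $(M)_m$ the Laplace transformation

$$\Lambda(-\alpha V) = \lim_{n\to\infty} \frac{1}{n} \log \mathbb{E}_\mu\left(e^{-\alpha \sum_{p=1}^n V(X_p)}\right)$$

doesn't depend on the choice of the initial distribution $\mu$, we deduce that

$$-\Lambda(-\alpha V) = -\log \mu_\alpha(Q_\alpha(1)) = \log \mu_\alpha(e^{\alpha V})$$

Since $\Lambda(-\alpha V)$ is also given as the Fenchel transformation of $I$

$$\Lambda(-\alpha V) = \sup_{\eta \in \mathcal{P}(E)} (\eta(-\alpha V) - I(\eta)) \qquad (6.11)$$

the end of the proof of the first assertion is clear. To end the proof, we note
that

$$V_I \le \inf_{\eta \in \mathcal{P}(E)} \left(\eta(V) + \frac{1}{\alpha} I(\eta)\right) \le \eta(V) + \frac{1}{\alpha} I(\eta)$$

for each distribution $\eta$ such that $I(\eta) < \infty$. Letting $\alpha \to \infty$, we find that

$$V_I \le \limsup_{\alpha \to \infty} \inf_{\eta \in \mathcal{P}(E)} \left(\eta(V) + \frac{1}{\alpha} I(\eta)\right) \le \eta(V)$$

Taking the infimum over all distributions $\eta$ such that $I(\eta) < \infty$, we obtain

$$\lim_{\alpha \to \infty} \frac{1}{\alpha} \log \mu_\alpha(e^{\alpha V}) = V_I$$

To see that $V_I \ge V_\nu$, it is clearly sufficient to show that for any probability
$\mu$, $I(\mu) < +\infty$ implies that $\mu \ll \nu$. One easy way to obtain this assertion
in our context is to note that if $I(\mu) < +\infty$, then there exists a kernel $K$
verifying $\mu = \mu K$ and $K(x, \cdot) \ll M(x, \cdot)$ for $\mu$-a.s. all $x \in E$. But since for
all $x \in E$, $M^m(x, \cdot)$ is equivalent to $\nu$, due to the condition $(M)_m$, we get
that $\mu = \mu K^m \ll \mu M^m \sim \nu$. This ends the proof of the lemma. ■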
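
The identities of Lemma 6.3.1 can be checked numerically on a finite state space. The sketch below is only an illustration under stated assumptions: the three-state kernel `M`, the potential `V`, the value of `alpha`, and the helper name `fixed_point_measure` are all hypothetical choices, and the fixed point is approximated by a plain power iteration (which is adequate here because all entries of `M` are positive).

```python
import math

def fixed_point_measure(M, V, alpha, iterations=2000):
    """Iterate mu -> mu Q_alpha / mu(Q_alpha(1)), where
    Q_alpha(x, y) = M[x][y] * exp(-alpha * V[y]) (finite state space)."""
    size = len(V)
    mu = [1.0 / size] * size
    mass = 1.0
    for _ in range(iterations):
        unnormalized = [
            sum(mu[x] * M[x][y] for x in range(size)) * math.exp(-alpha * V[y])
            for y in range(size)
        ]
        mass = sum(unnormalized)  # equals mu(Q_alpha(1)) for the current mu
        mu = [w / mass for w in unnormalized]
    return mu, mass

# Hypothetical example data (not taken from the text).
M = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.3, 0.2, 0.5]]
V = [0.0, 1.0, 2.0]
alpha = 2.0

mu, mass = fixed_point_measure(M, V, alpha)
lyapunov = math.log(mass)  # approximates Lambda(-alpha V)
exp_moment = sum(m * math.exp(alpha * v) for m, v in zip(mu, V))

# Both printed numbers approximate -Lambda(-alpha V) / alpha.
print(-lyapunov / alpha, math.log(exp_moment) / alpha)
```

The two printed values agree up to rounding, which reflects the relation (6.10) between $\mu_\alpha(Q_\alpha(1))$ and $\mu_\alpha(e^{\alpha V})$.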

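Using the exponential version of Markov's inequality, Lemma 6.3.1 provides
a concentration property of $\mu_\alpha$ in the level sets $\{V < V_I + \delta\}$, $\delta > 0$.
More precisely, we have for any $\delta > 0$

$$\mu_\alpha(V \ge V_I + \delta) = \mu_\alpha\left(e^{\alpha V} \ge e^{\alpha (V_I + \delta)}\right) \le e^{-\alpha (V_I + \delta)}\ \mu_\alpha(e^{\alpha V})$$

One concludes that

$$\lim_{\alpha \to \infty} \frac{1}{\alpha} \log \mu_\alpha(V \ge V_I + \delta) \le -\delta$$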

Combining this concentration property with Theorem 6.3.1, we prove the
following asymptotic convergence result.

Proposition 6.3.2 Suppose condition $(M)_m$ holds true for some $m \ge 1$,
and let $t(n)$ and $a'(n)$ be respectively the time mesh sequence and the cooling
schedule described in Theorem 6.3.1. Then the corresponding annealed
Feynman-Kac distribution flow $\eta_{t(n)}$ concentrates as $n \to \infty$ to regions with
potential less than $V_I$; that is, for each $\delta > 0$, we have that

$$\lim_{n \to \infty} \eta_{t(n)}(V \ge V_I + \delta) = 0$$



The topological hypotheses that $E$ is Polish and that $V$ is continuous are
only necessary to obtain (6.11); see for instance [112]. So except for the
definition (6.8), the concentration analysis developed in this section is true
under the assumptions that $(E, \mathcal{E})$ is a measurable space and $V$ is a
nonnegative bounded and measurable potential. In particular, under this
extended setting, we can consider

$$V_* =_{\text{def.}} -\lim_{\alpha \to +\infty} \frac{1}{\alpha} \lim_{n \to \infty} \frac{1}{n} \log \mathbb{E}_x\left[\exp\left(-\alpha \sum_{p=1}^n V(X_p)\right)\right]$$
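which always exists and does not depend on the initial condition $x \in E$. Indeed,
if we denote for all $n \in \mathbb{N}$ and $\alpha \in \mathbb{R}_+$

$$A_n(\alpha) = \inf_{x \in E} \log \mathbb{E}_x\left[\exp\left(-\alpha \sum_{p=1}^n V(X_p)\right)\right]$$

then it is quite clear via the Markov property that $(A_n(\alpha))_{n \in \mathbb{N}}$ is
superadditive so that the following limit exists:

$$\Lambda(\alpha) =_{\text{def.}} \lim_{n \to \infty} \frac{1}{n} A_n(\alpha)$$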

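(this is just a rewriting of the traditional existence of the Lyapunov
exponent of the underlying unnormalized Feynman-Kac operator). Now taking
into account condition $(M)_m$, it appears that for any $n \ge m$ and $x, x' \in E$,

$$\mathbb{E}_x\left[\exp\left(-\alpha \sum_{p=1}^n V(X_p)\right)\right] \ge \left[\epsilon(M) \exp(-(m-1)\,\alpha\ \mathrm{osc}(V))\right]\ \mathbb{E}_{x'}\left[\exp\left(-\alpha \sum_{p=1}^n V(X_p)\right)\right]$$

thus we see that

$$\lim_{n \to \infty} \frac{1}{n} \log\left(\frac{\mathbb{E}_x\left[\exp\left(-\alpha \sum_{p=1}^n V(X_p)\right)\right]}{\mathbb{E}_{x'}\left[\exp\left(-\alpha \sum_{p=1}^n V(X_p)\right)\right]}\right) = 0$$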







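In particular, for any initial distribution $\mu$ we have

$$\Lambda(\alpha) = \lim_{n \to \infty} \frac{1}{n} \log \mathbb{E}_\mu\left[\exp\left(-\alpha \sum_{p=1}^n V(X_p)\right)\right]$$

As a limit of convex functions, the l.h.s. term in the preceding display is a
convex function in $\alpha$. Thus, we are ensured of the existence of

$$V_* = -\lim_{\alpha \to +\infty} \frac{\Lambda(\alpha)}{\alpha} = -\lim_{\alpha \to +\infty} \frac{\Lambda(\alpha) - \Lambda(0)}{\alpha} = -\sup_{\alpha > 0} \frac{\Lambda(\alpha) - \Lambda(0)}{\alpha}$$

a priori in $\mathbb{R} \cup \{-\infty\}$, but as $V$ is nonnegative and bounded, we conclude
that $V_* \in \mathbb{R}_+$. In this context, Lemma 6.3.1 can be rewritten as saying that
under the topological hypotheses that $E$ is Polish and that $V$ is continuous,
we have $V_* = V_I \ge V_\nu$.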




6.3.5 Concentration Levels 

In this section, we discuss the concentration regions of $\mu_\alpha$ as $\alpha$ tends to
infinity. In a first subsection, we examine Feynman-Kac models where the
Markov kernel $M$ satisfies condition $(M)_m$ with $m = 1$ or has a regular
diagonal term. We show that in this case the concentration level $V_I$ coincides
with the essential infimum of the potential with respect to the invariant
measure of $M$. The second subsection focuses on Feynman-Kac models
on finite state spaces. We relate the exponential concentration of $\mu_\alpha$ with
a collection of Bellman's fixed point equations. We propose an alternative
characterization of the concentration level $V_I$. We show that $V_I$ can be seen
as the minimal mean potential value over all closed cycles on $E$. Thanks
to this representation, we prove that $V_I = V_\nu$ iff there exists a closed cycle
on $V^{-1}(V_\nu)$. For more general off-diagonal mutation transitions, we have
$V_I > V_\nu$. We illustrate this assertion with a simple three-point example,
showing furthermore that $\mu_\alpha$ does not concentrate on "neighborhoods" of
$V^{-1}(V_\nu)$.
Diagonal Mutations 

The easiest way to ensure that $V_I = V_\nu$ is to impose loops on every point
of $E$ for $M$. This assertion is based on the following simple upper bound.

Proposition 6.3.3

$$V_I \le V_M = \inf\{V(x) \ ;\ M(x, x) > 0\}$$

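Proof:

Let us prove that for any $x \in E$ with $M(x, x) > 0$ we have $V_I \le V(x)$. By
the definition of the Markov chain $X$, we find that for any $\alpha \in \mathbb{R}$ and any
$n \ge 1$

$$\mathbb{E}_x\left[\exp\left(-\alpha \sum_{p=1}^n V(X_p)\right)\right] \ge \mathbb{E}_x\left[1_{X_1 = x,\ X_2 = x,\ \ldots,\ X_n = x} \exp\left(-\alpha \sum_{p=1}^n V(X_p)\right)\right] = (M(x, x))^n \exp(-n \alpha V(x))$$

This yields that

$$\Lambda(-\alpha V) \ge \lim_{n \to \infty} \frac{1}{n} \log \mathbb{E}_x\left[\exp\left(-\alpha \sum_{p=1}^n V(X_p)\right)\right] \ge \log M(x, x) - \alpha V(x)$$

from which we conclude that

$$V_I = -\lim_{\alpha \to \infty} \frac{\Lambda(-\alpha V)}{\alpha} \le V(x)$$

This ends the proof of the proposition. ■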






As a simple corollary, we have $V_I = V_\nu$ as soon as we can find a sequence
$(x_n)_{n \in \mathbb{N}}$ such that $\lim_{n \to \infty} V(x_n) = V_\nu$ with $M(x_n, x_n) > 0$ for all $n \in \mathbb{N}$.
This clearly holds true when $M$ is chosen so that $M(x, x) > 0$ for any
$x \in E$.

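Also notice that $V_I = V_\nu$ as soon as $(M)_m$ is satisfied for $m = 1$. To see
this claim, we use the fixed point equation to check that, for any $\alpha \in \mathbb{R}_+$
and any nonnegative measurable function $f$ on $E$, we have

$$\epsilon^2(M)\ \frac{\nu(e^{-\alpha V} f)}{\nu(e^{-\alpha V})} \le \mu_\alpha(f) \le \frac{1}{\epsilon^2(M)}\ \frac{\nu(e^{-\alpha V} f)}{\nu(e^{-\alpha V})}$$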



Finite State Space 

We further suppose that $M$ is an irreducible Markov kernel on a finite state
space $E$. In this case, $M$ has a unique invariant measure $\nu$ and, for any
$x \in E$, we have $\nu(x) > 0$. As an aside, we note that in this situation condition
$(M)_m$ is met if and only if $M$ is aperiodic. Our immediate objective is to
give an explicit representation of $V_I$. For any $\mu \in \mathcal{P}(E)$, we have that

$$V_I = -\lim_{\alpha \to +\infty} \frac{1}{\alpha} \lim_{n \to \infty} \frac{1}{n} \log \mathbb{E}_\mu\left[\exp\left(-\alpha \sum_{p=1}^n V(X_p)\right)\right] \qquad (6.12)$$







Definition 6.3.1 A finite collection $P = (y_1, \ldots, y_n)$ of elements of $E$ is
called an $M$-path of length $l(P) = n \in \mathbb{N}$ if for any $1 \le i < n$ we have
$M(y_i, y_{i+1}) > 0$. The mean potential of an $M$-path $P = (y_1, \ldots, y_n)$ is
defined by

$$V(P) =_{\text{def.}} \frac{1}{n} \sum_{i=1}^n V(y_i)$$

An $M$-cycle of length $n \in \mathbb{N} - \{0\}$ is an $M$-path $(x_1, \ldots, x_n) \in E^n$ such
that $x_i \ne x_{i+1}$ for any $1 \le i < n$ and $M(x_n, x_1) > 0$.

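Proposition 6.3.4 Let $\mathcal{C}$ denote the set of $M$-cycles. Then

$$V_I = V_{\mathcal{C}} =_{\text{def.}} \inf_{C \in \mathcal{C}} V(C)$$

In particular, we have $V_I = V_\nu$ if and only if there exists an $M$-cycle inside
$V^{-1}(V_\nu)$.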

Proof:

We first prove that $V_I \ge V_{\mathcal{C}}$. Let $P = (y_1, \ldots, y_n)$ be an $M$-path of length
$n \in \mathbb{N}$. We can find $k$ $M$-cycles $C_1, \ldots, C_k$ and a subpath $R$ of $P$ (not
necessarily of the form $(y_r, y_{r+1}, \ldots, y_{r+l(R)})$) of length $l(R)$ less than
$\mathrm{card}(E)$ such that

$$l(P) V(P) = \sum_{1 \le i \le k} l(C_i) V(C_i) + l(R) V(R) \qquad (6.13)$$

To be convinced of the existence of such a decomposition, we look for the
first return of the path $P$ on itself: let $s = \min\{t \ge 2 : y_t \in \{y_1, \ldots, y_{t-1}\}\}$
and $1 \le r < s$ be such that $y_s = y_r$. Then we define

$$C_1 =_{\text{def.}} (y_r, y_{r+1}, \ldots, y_{s-1})$$

and we consider the new path $P' =_{\text{def.}} (y_1, \ldots, y_{r-1}, y_s, y_{s+1}, \ldots, y_n)$ (one
would have noted that $M(y_{r-1}, y_s) > 0$). Next, recursively applying the
previous procedure, we construct/remove the $M$-cycles $C_2, \ldots, C_k$ and we
end up with a path $R$ whose elements are all different. From formula (6.13),
we deduce that

$$l(P) V(P) \ge \sum_{1 \le i \le k} l(C_i) V_{\mathcal{C}} - \mathrm{card}(E) \|V\|_\infty \ge l(P) V_{\mathcal{C}} - 2\, \mathrm{card}(E) \|V\|_\infty$$

Thus, for any $x \in E$ and $n \in \mathbb{N}^*$, we have

$$\mathbb{E}_x\left[\exp\left(-\alpha \sum_{p=1}^n V(X_p)\right)\right] \le \exp\left(-(n \alpha V_{\mathcal{C}} - 2\, \mathrm{card}(E)\, \alpha \|V\|_\infty)\right)$$

and the announced bound follows at once. To prove the reverse inequality,
let us consider $C \in \mathcal{C}$ such that $V(C) = V_{\mathcal{C}}$. If an initial point $x$ and a large
enough length $n$ are given, we construct a path $P_n$ by first going from $x$ to a
point of $C$ by a self-avoiding path (whose existence is ensured by
irreducibility) and next always following $C$ (in the direction included in its
definition and jumping from its last element to the first one). Then it is quite
clear that $\lim_{n \to \infty} V(P_n) = V(C)$, thus denoting $q = \min_{x, y \in E\ :\ M(x, y) > 0} M(x, y)$
and taking into account the bound

$$\mathbb{E}_x\left[\exp\left(-\alpha \sum_{p=1}^n V(X_p)\right)\right] \ge q^n \exp(-n \alpha V(P_n))$$

we conclude by an argument similar to the one given in the proof of
Proposition 6.3.3. ■
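
On a small finite state space, the cycle characterization of Proposition 6.3.4 can be evaluated by brute force. The following is a minimal sketch, not an efficient algorithm: it enumerates only cycles with pairwise distinct states (which is enough to attain the minimal mean potential, since any closed path decomposes into such cycles, as in the proof above). The function name `min_mean_cycle` and the numerical values of `p` and `V` are illustrative assumptions; the kernel is the three-point kernel used at the end of this section.

```python
def min_mean_cycle(M, V):
    """Return (mean potential, cycle) minimizing the mean of V over all
    cycles of the directed graph {(x, y) : M[x][y] > 0} with distinct states."""
    size = len(V)
    best = None

    def extend(path, visited):
        nonlocal best
        first, last = path[0], path[-1]
        if M[last][first] > 0:  # the path closes into an M-cycle
            mean = sum(V[x] for x in path) / len(path)
            if best is None or mean < best[0]:
                best = (mean, tuple(path))
        for y in range(size):
            # Root every cycle at its smallest state to avoid duplicates.
            if y > first and y not in visited and M[last][y] > 0:
                visited.add(y)
                path.append(y)
                extend(path, visited)
                path.pop()
                visited.remove(y)

    for x in range(size):
        extend([x], {x})
    return best

# Three-point kernel with a loop at state 0 (values of p and V are assumptions).
p = 0.5
M = [[p, 1.0 - p, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0]]
V = [5.0, 0.0, 1.0]

best_mean, best_cycle = min_mean_cycle(M, V)
print(best_mean, best_cycle)
```

For these values the only cycles are the loop at state 0 and the full cycle through the three states, and the full cycle has the smaller mean potential.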



In fact, the equality of the preceding proposition remains valid if $M$
admits a unique recurrence class (but in this situation $\nu$ does not necessarily
charge all points of $E$). In the most general case, the initial point $x$ in (6.12)
plays a role: $V_I(x)$ is the minimal mean potential of $M$-cycles included in
the recurrence classes that can be reached from $x$.

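Remark 6.3.1 Let $\mathcal{A}_{\mathcal{C}}$ be the set of positive functions $f$ defined on $E$ that
are of the form $f = \sum_{C \in \mathcal{C}} a_C 1_C$ with $(a_C)_{C \in \mathcal{C}} \in \mathbb{R}_+^{\mathcal{C}}$. In view of the
preceding result, we note that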

Vj = inf{u{fV)/u{f) ; feAc} 

This expression should be compared with the general formula for V„: 

V, = mf{u{fV)/u{f)-, feA+} 

where *4+ denotes the set of positive bounded measurable functions defined 
on {E,£), 

To understand precisely the concentration phenomenon for $\mu_\alpha$ we would
like to obtain a large-deviation principle; that is, to find a function
$U : E \to \mathbb{R}_+$ such that for any $x \in E$

$$U(x) = -\lim_{\alpha \to +\infty} \frac{1}{\alpha} \log(\mu_\alpha(x))$$

(necessarily $\min_E U = 0$; in analogy with the generalized simulated
annealing, we would say that $U$ is the virtual energy). Unfortunately we have not
been able to prove such a convergence, even under the condition $(M)_m$,
but we are still trying to get this result. Nevertheless, we note that under
$(M)_m$ the family of mappings $(\log(\mu_\alpha(\cdot))/\alpha)_{\alpha \ge 1}$ is compact. Indeed, we
have for any $\alpha > 0$ and $x \in E$,

(^)^Mo(Q|(^ > e"(M) osc(V) »^( £ :^ 

Ma(Q?(l)) ■ 

> e^(M) °®^(^)i/(x) 







and therefore 

0 < - — log^a(x) < m osc(V) - — log ( e^(M)mini/(x) I 
a a \ x€E / 

We can consider the accumulation functions $U$ of $-\log(\mu_\alpha(x))/\alpha$ for large
$\alpha$. In order to derive the corresponding Bellman's equations, we introduce
for $n \in \mathbb{N}^*$ and $x, y \in E$ the $n$-communication cost function

$$V^{(n)}(x, y) =_{\text{def.}} \min_{P \in \mathcal{P}^{(n)}_{x,y}} V(P)$$

where $\mathcal{P}^{(n)}_{x,y}$ is the set of $M$-paths of length $n$ from $x$ to $y$. In particular,
for any $x, y \in E$, $V^{(1)}(x, y) = V(y)$. As in the proof of Proposition 6.3.3,
we prove that for any $x, y \in E$, $\liminf_{n \to \infty} V^{(n)}(x, y) = V_I$ (and this is a
true limit if $M$ is aperiodic, the difference of the two terms being at most
of order $1/n$).

For a subset $A \subset E$, we also define the $M$-boundary of $A$ as the subset
of all possible sites that are accessible from $A$; that is,

$$\partial_M(A) = \{y \in E - A \ ;\ \exists x \in A \quad M(x, y) > 0\}$$

Now we can state the following proposition.

Proposition 6.3.5 Let $U \in \mathbb{R}_+^E$ be any accumulation point as above; then
it satisfies Bellman's fixed point equations

$$U(y) = \inf_{x \in E} \left(U(x) + n V^{(n)}(x, y)\right) - n V_I \qquad (6.14)$$

for any $n \in \mathbb{N}^*$ and $n V_I = \inf_{x, y \in E} (U(x) + n V^{(n)}(x, y))$. Furthermore, we
have the inclusions

$$U^{-1}(0) \subset \{V \le V_I\} \quad\text{and}\quad \partial_M U^{-1}(0) \subset \{V > V_I\} \qquad (6.15)$$

Before getting into the proof of this proposition, let us pause for a while
and give some comments on the consequence of these results. The
inclusions (6.15) show that a point $x \in \{V \le V_I\}$ with energy $U(x) > 0$ cannot
be reached from $U^{-1}(0)$ (the reverse being in general true). This shows
that when all pairs of points $x, y \in \{V \le V_I\}$ can be joined by a path in
this level set, then $U^{-1}(0) = \{V \le V_I\}$.

Proof of Proposition 6.3.5:

Bellman's equations are immediate consequences of the fixed point
equation (see the proof of Lemma 6.3.1). We have for any $n \in \mathbb{N} - \{0\}$, $x \in E$,
and $\alpha > 0$

$$\mu_\alpha(x) = \left[\mu_\alpha(\exp(\alpha V))\right]^n \sum_{y \in E} \mu_\alpha(y)\ \mathbb{E}_y\left[1_{\{x\}}(X_n) \exp\left(-\alpha \sum_{p=1}^n V(X_p)\right)\right]$$

Taking the logarithm, dividing by $\alpha$, and letting $\alpha$ tend to infinity, we get
the desired formulae. To prove the inclusions (6.15), we suppose on the
contrary that we can find a pair $(x, y) \in E^2$ such that

$$U(x) = 0, \quad M(x, y) > 0, \quad U(y) > 0, \quad\text{and}\quad V(y) \le V_I$$

From Bellman's equation, this will give that

$$U(y) = \inf\{U(z) + V(y) - V_I \ ;\ z \in E,\ M(z, y) > 0\} \le \inf\{U(z) \ ;\ M(z, y) > 0\} \le U(x) = 0$$

and we obtain a contradiction with the fact that $U(y) > 0$. ■



We end this section with a simple three-point example in which $V_I > V_\nu$
and $V^{-1}(V_\nu) \ne U^{-1}(0)$. So we take for state space $E = \{0, 1, 2\}$ and we
consider the Markov kernel defined by

$$M = \begin{pmatrix} p & 1-p & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \quad\text{with}\quad p \in (0, 1)$$

It is clear that $M$ is irreducible and aperiodic, and we check that its unique
invariant probability $\nu$ is given by

$$\nu(0) = \frac{1}{3 - 2p} \quad\text{and}\quad \nu(1) = \nu(2) = \frac{1 - p}{3 - 2p}$$

Let $V : E \to \mathbb{R}_+$ be a potential function such that

$$V(0) > \frac{V(0) + V(1) + V(2)}{3} > V(2) > V(1) = 0 \qquad (6.16)$$

So the $\nu$-essential infimum $V_\nu$ is given by $V_\nu = 0 = V(1)$, and by
Proposition 6.3.4 we have

$$V_I = (V(0) + V(1) + V(2))/3$$

This could also be deduced from the fact that here the rate function $I$
satisfies

$$I(\mu) < \infty \iff \exists\, r \in [0, 1] \ :\ \mu = r (\delta_0 + \delta_1 + \delta_2)/3 + (1 - r) \delta_0$$

a property that reflects that trajectories of $X$ are concatenations of the
words [0] and [1,2,0] (except for a possible start with [2]). Our next objective
is to solve explicitly Bellman's fixed point equation (6.14) for $n = 1$:

$$\begin{cases} U(0) = \min\{U(0), U(2)\} + V(0) - V_I \\ U(1) = U(0) + V(1) - V_I \\ U(2) = U(1) + V(2) - V_I \end{cases}$$







By (6.16), we see that in the first equality the minimum cannot be $U(0)$
(otherwise $V(0) = V_I$), so $U(0) = U(2) + V(0) - V_I$ and this shows that
$U(2) < U(0)$. The last equation also implies that $U(2) < U(1)$ and
necessarily $U(2) = 0$, from which we obtain that $U$ is unique and that it is given
by

$$\begin{cases} U(0) = V(0) - V_I \\ U(1) = V_I - V(2) \\ U(2) = 0 \end{cases}$$

One concludes that $\lim_{\alpha \to \infty} \mu_\alpha(2) = 1$ and that this convergence is
exponentially fast. In particular, $\mu_\alpha$ does not concentrate for large $\alpha$ on the
unique point 1, where the "essential" infimum is achieved (this latter
assertion could also be deduced directly from the observation (6.15)).
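
The conclusions of the three-point example can be checked numerically. The sketch below is an illustration under assumptions: the values `p = 0.5` and `V = (5, 0, 1)` are hypothetical choices satisfying (6.16), and the helper name `three_point_fixed_point` is not from the text. Because the kernel is almost periodic for large $\alpha$, a power iteration would converge very slowly, so the code instead solves the eigenvalue equation $\lambda^3 = p\, e^{-\alpha V(0)} \lambda^2 + (1-p)\, e^{-\alpha (V(0)+V(1)+V(2))}$ of $Q_\alpha$ by bisection, after the rescaling $\lambda = s\, e^{-\alpha V_I}$.

```python
import math

def three_point_fixed_point(p, V, alpha):
    """Fixed point mu_alpha for the kernel M = [[p, 1-p, 0], [0, 0, 1], [1, 0, 0]]
    and Q_alpha(x, y) = M[x][y] * exp(-alpha * V[y])."""
    v_mean = sum(V) / 3.0
    a = p * math.exp(-alpha * (V[0] - v_mean))
    # Rescaled top eigenvalue s solves s^2 (s - a) = 1 - p on [lo, lo + a].
    lo = (1.0 - p) ** (1.0 / 3.0)
    hi = lo + a
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * mid * (mid - a) < 1.0 - p:
            lo = mid
        else:
            hi = mid
    s = 0.5 * (lo + hi)
    # Left eigenvector, normalized so that the weight of state 0 equals 1.
    w0 = 1.0
    w1 = (1.0 - p) * math.exp(alpha * (v_mean - V[1])) / s
    w2 = w1 * math.exp(alpha * (v_mean - V[2])) / s
    total = w0 + w1 + w2
    return [w0 / total, w1 / total, w2 / total]

p = 0.5
V = [5.0, 0.0, 1.0]          # satisfies (6.16): 5 > 2 > 1 > 0
V_I = sum(V) / 3.0
U = [V[0] - V_I, V_I - V[2], 0.0]   # solution of Bellman's equations above

for alpha in (1.0, 5.0, 20.0):
    mu = three_point_fixed_point(p, V, alpha)
    rates = [-math.log(m) / alpha for m in mu]
    print(alpha, mu[2], rates)
```

As $\alpha$ grows, the printed mass `mu[2]` approaches 1 and the rates `-log(mu[x]) / alpha` approach the values of `U`. For `alpha = 0` the function returns the invariant measure $\nu$.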




7 

Asymptotic Behavior 



7.1 Introduction 

This chapter provides an introduction to the asymptotic behavior of
particle approximation models as the size of the systems and/or the time
horizon tends to infinity. In the following picture, we have illustrated the
random evolution of the simple $N$-genetic approximation model described
in (3.4). This picture gives a sound basis to the main questions related to
the asymptotic analysis of the particle approximation scheme.



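[Diagram: evolution of the limiting flow $\eta_0 \to \eta_1 = \Phi_1(\eta_0) \to \eta_2 = \Phi_{0,2}(\eta_0) \to \cdots \to \Phi_{0,n}(\eta_0)$, together with the successive $N$-particle approximations $\eta_p^N$ and their propagations $\Phi_{p,n}(\eta_p^N)$, $0 \le p \le n$.]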

Intuitively, we first observe that the sampling error (represented by the
implication sign "⇓") does not propagate but stabilizes as soon as the
semigroup $\Phi_{p,n}$ is sufficiently stable. This intuitive idea is made clear by
the pivotal formula

$$\eta_n^N - \eta_n = \sum_{q=0}^n \left[\Phi_{q,n}(\eta_q^N) - \Phi_{q,n}(\Phi_q(\eta_{q-1}^N))\right]$$

with the convention $\Phi_0(\eta_{-1}^N) = \eta_0$ for $q = 0$. Note that each term on the
r.h.s. represents the propagation of the $q$th sampling local error
$\Phi_q(\eta_{q-1}^N) \Rightarrow \eta_q^N$. This observation indicates that the numerical analysis of the particle
algorithm or any numerical approximation model (based on local
approximations) is intimately related to the stability property of the nonlinear
semigroup of the limiting model. The picture also suggests that the
fluctuations of the flow of local errors (properly renormalized) behave
asymptotically as a sequence of independent and identically distributed Gaussian
random variables. These questions will be made clear in the further
development of this chapter (see also [86, 87, 100, 220, 219]).

The chapter is organized as follows. In Section 7.2, we provide a short
discussion on Feynman-Kac models and their particle interpretations. We
also take the opportunity to fix some of the notation and some regularity
conditions we shall be using in the further development of this book.
Section 7.3 focuses on independent sequences of random variables (which
we shall abbreviate iid). In Section 7.3.1, we discuss some general
inequalities such as a refined version of the inequalities of
Khinchine/Burkholder/Marcinkiewicz-Zygmund. We already mention that these original
inequalities provide a natural and simple way to estimate the
moment-generating functions of the empirical measures associated with independent
random variables. In Section 7.3.2, we derive some more or less well-known
$L_p$ and exponential inequalities for empirical processes. These estimates
extend the corresponding statements for sums of iid to the convergence of
empirical processes with respect to some Zolotarev type seminorm. The
inequalities presented in this section will be of use in the further development
of this chapter. Special attention is also paid to deriving, as far as possible,
precise and sharp inequalities. This choice is not only for mathematical
elegance; in some instances it is essential to start with a precise estimate to
work out another analytical result with exact rates of decay. For instance,
the complement of the $L_p$-inequalities of Burkholder presented in this
section provides precise and sharp constants. We will use this estimate in the
proof of strong propagation-of-chaos estimates. In this particular situation,
we propose a strategy of analysis in which the exact decay rates in the
propagation-of-chaos are related to the precision of these $L_p$-inequalities.

The strong law of large numbers for interacting particle systems is
discussed in Section 7.4. In Section 7.4.1 and Section 7.4.2, we study a fairly
general class of interacting processes, including the situation where the
potential functions may take null values and the algorithm may be stopped
when the system dies. In this connection, we derive in Section 7.4.1 several
types of exponential bounds to estimate the probability of extinction. We
also mention that Section 7.4.2 contains some key martingale type
decompositions that are essential on our way to proving central limit theorems.
The final Section 7.4.3 focuses on estimates that are uniform with respect to
the time parameter. We examine this question from different angles related
to a graduated set of regularity conditions. These estimates are probably
among the most important results in practice. They allow us to quantify the
size of the particle approximation models that ensures a given precision.



7.2 Some Preliminaries 

For the convenience of the reader, we have collected hereafter some
essential results on Feynman-Kac semigroups and their interacting particle
interpretations. Let $(E_n, \mathcal{E}_n)_{n \ge 0}$ be a collection of measurable spaces. For
any $p \le n$, we recall that $E_{[p,n]} = (E_p \times \ldots \times E_n)$ and $E_{(p,n]} = E_{[p+1,n]}$.
Also let $\eta_0 \in \mathcal{P}(E_0)$ and $M_n(x_{n-1}, dx_n)$ be a sequence of Markov transitions
from $E_{n-1}$ into $E_n$, $n \ge 1$. We denote by $G_n : E_n \to (0, \infty)$ a collection of
nonnegative and bounded $\mathcal{E}_n$-measurable functions, and we associate with
the triplet $(\eta_0, G_n, M_n)$ the Feynman-Kac measures $\eta_n \in \mathcal{P}(E_n)$ defined
for any $f_n \in \mathcal{B}_b(E_n)$ and $n \in \mathbb{N}$ by the formulae

$$\eta_n(f_n) = \gamma_n(f_n)/\gamma_n(1) \quad\text{with}\quad \gamma_n(f_n) = \mathbb{E}_{\eta_0}\left(f_n(X_n) \prod_{p=0}^{n-1} G_p(X_p)\right) \qquad (7.1)$$

where $\mathbb{E}_{\eta_0}$ stands for the expectation with respect to the distribution of an
$E_n$-valued Markov chain $X_n$ with transitions $M_n$. Without further
mention, we will suppose that $G_n$ satisfy condition $(G)$ for some $\epsilon_n(G) > 0$ (see
page 115). By the definition of $\eta_n$, no generality is lost and much
convenience is gained by supposing, as will be done in this chapter and unless
otherwise stated, that the potential functions $G_n$ take values in $(0, 1]$ (see
also Section 2.5 and page 77 Section 2.5.3). We recall that the distribution
flow $\eta_n$ satisfies the nonlinear equation $\eta_{n+1} = \Phi_{n+1}(\eta_n)$, where the
mapping $\Phi_{n+1} : \mathcal{P}(E_n) \to \mathcal{P}(E_{n+1})$ is defined for any $\eta \in \mathcal{P}(E_n)$ by

$$\Phi_{n+1}(\eta) = \Psi_n(\eta) M_{n+1} \quad\text{with}\quad \Psi_n(\eta)(dx) = \frac{1}{\eta(G_n)}\ G_n(x)\ \eta(dx) \qquad (7.2)$$
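
On a finite state space the recursion (7.2) and the definition (7.1) can be compared directly. The sketch below assumes, for simplicity only, a time-homogeneous potential `G` and kernel `M` (the text allows both to depend on $n$); all numerical values and function names are illustrative. It also checks the product formula $\gamma_n(1) = \prod_{p=0}^{n-1} \eta_p(G_p)$, which follows from (7.1) and (7.2).

```python
def boltzmann_gibbs(eta, G):
    """Psi(eta)(x) = G(x) eta(x) / eta(G) on a finite state space."""
    mass = sum(e * g for e, g in zip(eta, G))
    return [e * g / mass for e, g in zip(eta, G)]

def phi(eta, G, M):
    """One step of the nonlinear flow: Phi(eta) = Psi(eta) M."""
    psi = boltzmann_gibbs(eta, G)
    size = len(eta)
    return [sum(psi[x] * M[x][y] for x in range(size)) for y in range(size)]

def gamma(eta0, G, M, n):
    """Unnormalized measure gamma_n = eta_0 Q^n with Q(x, y) = G(x) M(x, y)."""
    g = list(eta0)
    size = len(g)
    for _ in range(n):
        g = [sum(g[x] * G[x] * M[x][y] for x in range(size)) for y in range(size)]
    return g

# Hypothetical example data.
eta0 = [0.5, 0.3, 0.2]
G = [1.0, 0.5, 0.25]
M = [[0.6, 0.3, 0.1],
     [0.2, 0.6, 0.2],
     [0.1, 0.3, 0.6]]
n = 5

eta = list(eta0)
product = 1.0
for _ in range(n):
    product *= sum(e * g for e, g in zip(eta, G))  # eta_p(G_p)
    eta = phi(eta, G, M)

gam = gamma(eta0, G, M, n)
normalized = [w / sum(gam) for w in gam]
print(eta, normalized, product, sum(gam))
```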

We let $Q_{p,n}$ and $\Phi_{p,n}$, $p \le n$, be the semigroups associated respectively
with the Feynman-Kac distribution flows $\gamma_n$ and $\eta_n$ defined in (7.1),

$$Q_{p,n} = Q_{p+1} \cdots Q_{n-1} Q_n \quad\text{and}\quad \Phi_{p,n} = \Phi_n \circ \Phi_{n-1} \circ \ldots \circ \Phi_{p+1}$$

with $Q_n(x_{n-1}, dx_n) = G_{n-1}(x_{n-1}) M_n(x_{n-1}, dx_n)$. We use the convention
$Q_{n,n} = Id$ and $\Phi_{n,n} = Id$ for $p = n$. We recall that $\Phi_{p,n}$ is a nonlinear
integral operator from $\mathcal{P}(E_p)$ into $\mathcal{P}(E_n)$. For any $(\mu_p, f_n) \in (\mathcal{P}(E_p) \times \mathcal{B}_b(E_n))$,
it can be written in terms of a Boltzmann-Gibbs transformation

$$\Phi_{p,n}(\mu_p)(f_n) = \mu_p(G_{p,n} P_{p,n}(f_n))/\mu_p(G_{p,n})$$

with the pair potential/transition $(G_{p,n}, P_{p,n})$ defined by

$$G_{p,n} = Q_{p,n}(1) \quad\text{and}\quad P_{p,n}(f_n) = Q_{p,n}(f_n)/Q_{p,n}(1)$$

The next two parameters

$$r_{p,n} = \sup_{x_p, y_p \in E_p} \left(G_{p,n}(x_p)/G_{p,n}(y_p)\right)$$

and

$$\beta(P_{p,n}) = \sup_{x_p, y_p \in E_p} \|P_{p,n}(x_p, \cdot) - P_{p,n}(y_p, \cdot)\|_{tv} \qquad (7.3)$$

measure respectively the relative oscillations of the potential functions
$G_{p,n}$ and the contraction properties of the Markov transition $P_{p,n}$.
Various asymptotic estimates on particle models derived in the forthcoming
sections will be expressed in terms of these parameters.



7.2.1 McKean Interpretations 

The flow $\eta_n$ can alternatively be described by a nonlinear equation of the
form $\eta_{n+1} = \eta_n K_{n+1,\eta_n}$, where $K_{n+1,\eta}$, $\eta \in \mathcal{P}(E_n)$, is a (nonunique)
collection of Markov transitions satisfying the compatibility condition

$$\eta K_{n+1,\eta} = \Psi_n(\eta) M_{n+1} = \Phi_{n+1}(\eta)$$

We associate with a given pair $(\eta_0, K_{n,\eta})$ the McKean measure $\mathbb{K}_{\eta_0}$ on the
canonical space $(\Omega = \prod_{n \ge 0} E_n, \mathcal{F} = (\mathcal{F}_n)_{n \ge 0})$ with marginals

$$\mathbb{K}_{\eta_0,n}(d(x_0, \ldots, x_n)) = \eta_0(dx_0)\ K_{1,\eta_0}(x_0, dx_1) \cdots K_{n,\eta_{n-1}}(x_{n-1}, dx_n) \in \mathcal{P}(E_{[0,n]})$$

Given a McKean measure, the Feynman-Kac flow $\eta_n$ can be interpreted
as the (marginal) distributions of a nonhomogeneous Markov chain with
transitions $K_{n,\eta_{n-1}}$ and initial distribution $\eta_0$. The corresponding $N$-particle
model is defined as a sequence of nonhomogeneous and $E_n^N$-valued Markov
chains

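$$\left(\Omega^{(N)} = \prod_{n \in \mathbb{N}} E_n^N,\ \mathcal{F}^N = (\mathcal{F}_n^N)_{n \in \mathbb{N}},\ (\xi_n)_{n \in \mathbb{N}},\ \mathbb{P}^N_{\eta_0}\right)$$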







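The initial configuration $\xi_0$ consists of $N$ independent and identically
distributed random variables with common law $\eta_0$ and its elementary
transitions from $E_{n-1}^N$ into $E_n^N$ are given in a symbolic integral form by

$$\mathbb{P}^N_{\eta_0}(\xi_n \in dx_n \mid \xi_{n-1}) = \prod_{p=1}^N K_{n, m(\xi_{n-1})}(\xi_{n-1}^p, dx_n^p)$$

where $m(\xi_{n-1}) = \frac{1}{N} \sum_{i=1}^N \delta_{\xi_{n-1}^i}$ and $dx_n = dx_n^1 \times \ldots \times dx_n^N$ is an infinitesimal
neighborhood of a point $x_n = (x_n^1, \ldots, x_n^N) \in E_n^N$.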

Several examples of McKean models are described in Section 2.5.3. Two 
generic situations arising in practice can be underlined: 

• Case 1: $K_{n+1,\eta}(x, \cdot) = \Phi_{n+1}(\eta)$

• Case 2: $K_{n+1,\eta}(x, \cdot) = G_n(x)\ M_{n+1}(x, \cdot) + (1 - G_n(x))\ \Phi_{n+1}(\eta)$

We recall that these two situations belong to the same class of McKean
transitions defined by

$$K_{n+1,\eta}(x, \cdot) = \epsilon_n(\eta) G_n(x)\ M_{n+1}(x, \cdot) + (1 - \epsilon_n(\eta) G_n(x))\ \Phi_{n+1}(\eta)$$

for some constant $\epsilon_n(\eta)$ that may depend on the current pair of parameters
$(n, \eta)$ and such that $\epsilon_n(\eta) G_n \le 1$. The two cases above correspond to the
situation where, respectively, $\epsilon_n(\eta) = 0$ and $\epsilon_n(\eta) = 1$.
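
The two McKean interpretations lead to two simple $N$-particle simulation schemes. The following sketch is only an illustration on a finite state space with a time-homogeneous potential `G` (with values in $(0, 1]$) and kernel `M`; these data, the seed, the particle number, and the function names `particle_step` and `exact_flow` are all assumptions made for the example. In Case 1 every particle is resampled from the Boltzmann-Gibbs measure of the current population before mutating; in Case 2 a particle at site $x$ keeps its site with probability $G(x)$ and is resampled otherwise.

```python
import random

def particle_step(xs, G, M, rng, case=1):
    """One selection/mutation transition of the N-particle model."""
    states = list(range(len(G)))
    # Candidates drawn from Psi(m(xs)): proportional to G over the population.
    selected = rng.choices(xs, weights=[G[x] for x in xs], k=len(xs))
    new_xs = []
    for x, candidate in zip(xs, selected):
        if case == 2 and rng.random() < G[x]:
            parent = x          # Case 2: accepted, the particle keeps its site
        else:
            parent = candidate  # Case 1, or rejection in Case 2
        new_xs.append(rng.choices(states, weights=M[parent])[0])  # mutation
    return new_xs

def exact_flow(eta0, G, M, n):
    """Exact flow eta_n = Phi(eta_{n-1}) for comparison."""
    eta = list(eta0)
    size = len(eta)
    for _ in range(n):
        mass = sum(e * g for e, g in zip(eta, G))
        psi = [e * g / mass for e, g in zip(eta, G)]
        eta = [sum(psi[x] * M[x][y] for x in range(size)) for y in range(size)]
    return eta

def empirical_flow(eta0, G, M, n, n_particles, case, seed=0):
    rng = random.Random(seed)
    xs = rng.choices(range(len(G)), weights=eta0, k=n_particles)
    for _ in range(n):
        xs = particle_step(xs, G, M, rng, case)
    return [xs.count(y) / n_particles for y in range(len(G))]

# Hypothetical example data.
eta0 = [0.5, 0.3, 0.2]
G = [1.0, 0.5, 0.25]
M = [[0.6, 0.3, 0.1],
     [0.2, 0.6, 0.2],
     [0.1, 0.3, 0.6]]

print(exact_flow(eta0, G, M, 5))
print(empirical_flow(eta0, G, M, 5, 2000, case=1))
print(empirical_flow(eta0, G, M, 5, 2000, case=2))
```

Both empirical measures should be close to the exact flow, with fluctuations of order $1/\sqrt{N}$.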

Except for a few situations, such as the fluctuations on path-space and
propagation-of-chaos analysis with respect to the total variation distance,
the asymptotic theory developed in this book applies to any kind of McKean
interpretation model. To give a practical sound basis to the forthcoming
analysis, sometimes we illustrate our results on the two cases described
above. We shall distinguish the corresponding particle and McKean models
by mentioning that they are related to the first and second cases (McKean
interpretation).

7.2.2 Vanishing Potentials 

Let us now take up the problem where the potential functions may vanish
on some regions of the state spaces. In this situation, the Feynman-Kac
model represents the distributions of a single Markov particle model
evolving in an absorbing medium with hard obstacles (see Section 2.4.3,
Section 2.5, Section 3.3 and Section 4.4). Let $\widehat{E}_n = E_n - G_n^{-1}(0)$. We recall
(see Section 2.5, page 71) that the limiting flow $\eta_n$ is well-defined only up
to the first time $\tau$ we have $\eta_\tau(\widehat{E}_\tau) = 0\ (= \eta_\tau(G_\tau 1_{\widehat{E}_\tau}))$; that is, up to the
deterministic time horizon

$$\tau = \inf\{n \in \mathbb{N} : \gamma_{n+1}(1) = 0\} = \inf\{n \in \mathbb{N} : \eta_n(G_n) = 0\} \in (0, \infty]$$






Note that

$$\tau = \infty \iff \forall n \in \mathbb{N} \quad \gamma_n(1) > 0$$

As an aside, since $G_n$ are assumed to be $[0, 1]$-valued potential functions,
we find that

$$0 \le \gamma_{n+1}(1) \le \gamma_n(1) \le 1$$

Consequently, we have $\gamma_n(1) > 0$ if and only if we have $\gamma_p(1) > 0$ for any
$0 \le p \le n$. Next we present two sufficient conditions under which the
normalizing constant $\gamma_n(1)$ is well-defined for any $n \in \mathbb{N}$.

(A) For any $n \ge 1$ and $x_n \in \widehat{E}_n$, $\eta_0(\widehat{E}_0) > 0$ and $M_{n+1}(x_n, \widehat{E}_{n+1}) > 0$.

(B) There exists a sequence of positive numbers a(n) such that for any 
n > 0 and € En we have 

and M„+i(xn,^n+i) > 1 - 
It is easily checked that (B) $\Rightarrow$ (A) $\Rightarrow$ $\tau = \infty$.

When $\widehat{E}_n = E_n$, condition (B) holds true for any choice of $a(n)$. Also
recall that if (A) is met, the updated distribution flow model $\widehat{\eta}_n = \Psi_n(\eta_n)$
can be regarded as the prediction flow model with initial distribution $\widehat{\eta}_0$ and
associated with the pair potential/transition $(\widehat{G}_n, \widehat{M}_n)$ defined in
Proposition 2.5.2 on page 70. In other words, under (A) the analysis of the updated
flow $\widehat{\eta}_n$ reduces to that of a prediction flow with strictly positive potentials.

In accordance with previous comments, the $N$-interacting particle
systems

$$\xi_n \longrightarrow \widehat{\xi}_n \longrightarrow \xi_{n+1}$$

associated with a general class of Feynman-Kac models with $[0, 1]$-valued
potential functions are only defined up to the time $\tau^N = n$ at which the whole
configuration $\xi_n \in E_n^N$ first hits the hard obstacle set $(E_n - \widehat{E}_n)^N$:

$$\tau^N = \inf\{n \in \mathbb{N} : m(\xi_n)(G_n) = 0\} \in [0, \infty]$$

These stopped algorithms are defined as Markov chains taking values in
$E_n^N \cup \{\Delta\}$, where $\Delta = (\partial, \ldots, \partial)$ ($N$ times) represents a cemetery point and
$\widehat{\xi}_p = \xi_{p+1} = \Delta$ for any $p \ge \tau^N$. Notice that we have $\xi_n = \Delta$ if and only if
$\xi_n^i = \partial$ for all $1 \le i \le N$. We refer the reader to Section 3.3 for a precise
construction of the probability space associated with these models.

It follows from the definition of $\tau^N$ that

T^ = n |o€Bo,...,f„_i €B„-i and = A 

^ Co€Bo,...,Cn-i €B„_i and 

and > n •<=>• G Eq, . . . ,^n-i € B„_i. This indicates that is a 
predictable Markov time with respect to the natural filtration asso- 
ciated with the Markov chain in the sense that {r^ = n} € and 







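$\{\tau^N > n\} \in \mathcal{F}^N_{n-1}$. In this context, the $N$-particle density profiles
associated with the Feynman-Kac flows $(\gamma_n, \eta_n)$ are given by

$$\eta_n^N = \frac{1}{N} \sum_{i=1}^N \delta_{\xi_n^i} \quad\text{and}\quad \gamma_n^N(\cdot) = \eta_n^N(\cdot) \times \prod_{p=0}^{n-1} \eta_p^N(G_p) \in \mathcal{M}_+(E_n \cup \{\partial\})$$

Since test functions $f_n \in \mathcal{B}_b(E_n)$ are extended to $E_n \cup \{\partial\}$ by setting
$f_n(\partial) = 0$, we shall identify the null measure on $E_n \cup \{\partial\}$ with the Dirac
measure $\delta_\partial$. In these conventions, we have

$$\tau^N < n \implies \eta_n^N = 0 = \gamma_n^N$$

More interestingly, in the event that $\tau^N \ge n$, the $N$-particle model $\xi_n \in E_n^N$
at time $n$ has not been killed and we have

$$1_{\tau^N \ge n} \times \eta_n^N \in \mathcal{M}_+(E_n) \quad\text{and}\quad 1_{\tau^N \ge n} \times \gamma_n^N \in \mathcal{M}_+(E_n)$$

When no confusion can be made, we shall simplify the notation by suppressing
the superscript $(\cdot)^N$ in the expectation operators. For instance, we shall
often denote by $\mathbb{E}(\cdot)$ instead of $\mathbb{E}^N(\cdot)$ the expectation operator associated
with the distribution of an $N$-particle type model.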



7.3 Inequalities for Independent Random Variables 

Let $(\mu_i)_{i \ge 1}$ be a sequence of probability measures on a given measurable
state space $(E, \mathcal{E})$. We also consider a sequence of $\mathcal{E}$-measurable functions
$(h_i)_{i \ge 1}$ such that $\mu_i(h_i) = 0$ for all $i \ge 1$. During the further development
of this section, we fix an integer $N \ge 1$. To clarify the presentation, we
slightly abuse the notation and we denote respectively by

$$m(X) = \frac{1}{N} \sum_{i=1}^N \delta_{X^i} \quad\text{and}\quad \mu = \frac{1}{N} \sum_{i=1}^N \mu_i$$

the $N$-empirical measure associated with a collection of independent
random variables $X = (X^i)_{i \ge 1}$, with respective distributions $(\mu_i)_{i \ge 1}$, and the
$N$-averaged measure associated with the sequence of measures $(\mu_i)_{i \ge 1}$. To
clarify the presentation, when we are given $N$-sequences of points
$x = (x^i)_{1 \le i \le N} \in E^N$ and functions $(h_i)_{1 \le i \le N} \in \mathcal{B}_b(E)^N$, we shall often use the
abusive notations

$$m(x)(h) = \frac{1}{N} \sum_{i=1}^N h_i(x^i) \quad\text{and}\quad \sigma^2(h) = \frac{1}{N} \sum_{i=1}^N \mathrm{osc}^2(h_i)$$

t=l i=l 







For any pair of integers $(p, n)$ with $1 \le p \le n$, we denote by

$$(n)_p = n!/(n-p)!$$

the number of one-to-one mappings from a set of $p$ elements into another
set of $n$ elements.
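
As a quick sanity check of this combinatorial notation, the sketch below compares the closed formula with a brute-force count of injections. It relies on `math.perm` from the Python standard library (available from Python 3.8); the helper name `count_injections` is an illustrative choice.

```python
import itertools
import math

def count_injections(p, n):
    """Count one-to-one mappings from {0, ..., p-1} into {0, ..., n-1}
    by enumerating ordered selections of p distinct targets."""
    return sum(1 for _ in itertools.permutations(range(n), p))

for n, p in [(3, 1), (4, 2), (5, 3), (6, 6)]:
    closed_form = math.factorial(n) // math.factorial(n - p)
    print(n, p, closed_form, math.perm(n, p), count_injections(p, n))
```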

7.3.1 Lp and Exponential Inequalities 

We start with two elementary and well-known results. 

Lemma 7.3.1 Let X be a real-valued random variable X with a<X<b 
and E(X) = 0. Then for any t>0 

E(e‘^) < 



Proof: 

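To prove this assertion, we use the convexity property of the exponential
function to check that for any $x \in [a, b]$

\[ \frac{e^{tx} - e^{ta}}{x - a} \le \frac{e^{tb} - e^{ta}}{b - a} \]

or equivalently $e^{tx} \le e^{ta} + \frac{x-a}{b-a}\,(e^{tb} - e^{ta})$. This convexity inequality readily
bounds the moment-generating function of a random variable in terms of
its mean and support

\[ E(e^{tX}) \le \frac{E(X) - a}{b - a}\, e^{tb} + \frac{b - E(X)}{b - a}\, e^{ta} = \frac{b\, e^{ta} - a\, e^{tb}}{b - a} = e^{\psi(t(b-a))} \qquad (7.5) \]

with

\[ \psi(s) = d\, s + \log[1 + d\,(1 - e^{s})] \quad\text{and}\quad d = \frac{a}{b - a} \]

By straightforward calculations, we find the derivatives

\[ \psi'(s) = d + \frac{d}{d - (1 + d)e^{-s}} \quad\text{and}\quad \psi''(s) = -\frac{d(1 + d)e^{-s}}{(d - (1 + d)e^{-s})^2} \]

Since $a \le 0 \le b$, we have $d \le 0 \le 1 + d$, so that $\psi''(s)$ has the form $uv/(u+v)^2$ with
$u = -d \ge 0$ and $v = (1+d)e^{-s} \ge 0$, and therefore $\psi''(s) \le 1/4$ (in particular
$\psi''(0) = -d(1+d) = -ab/(b-a)^2 \le 1/4$). Since $\psi(0) = \psi'(0) = 0$, using Taylor's
formula we find that

\[ \psi(s) \le \frac{s^2}{8} \]

and by (7.5) we end the proof of the lemma. ■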
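The bound of Lemma 7.3.1 can be checked numerically on the extremal two-point law, for which the convexity step in the proof is an equality. This is a small sketch under the assumption that $X$ takes the value $b$ with probability $-a/(b-a)$ and the value $a$ with probability $b/(b-a)$, so that $E(X) = 0$; the helper names are illustrative.

```python
import math


def mgf_two_point(t: float, a: float, b: float) -> float:
    """E(exp(tX)) for the centred law on {a, b} with a < 0 < b.

    P(X = b) = -a / (b - a) and P(X = a) = b / (b - a), so E(X) = 0 and
    E(exp(tX)) = (b e^{ta} - a e^{tb}) / (b - a), the middle term of (7.5).
    """
    return (b * math.exp(t * a) - a * math.exp(t * b)) / (b - a)


def hoeffding_mgf_bound(t: float, a: float, b: float) -> float:
    """Right-hand side exp(t^2 (b - a)^2 / 8) of Lemma 7.3.1."""
    return math.exp(t * t * (b - a) ** 2 / 8.0)


def bound_holds(a: float, b: float, ts) -> bool:
    """Check the inequality on a grid of t values (tiny float slack)."""
    return all(
        mgf_two_point(t, a, b) <= hoeffding_mgf_bound(t, a, b) * (1.0 + 1e-12)
        for t in ts
    )


if __name__ == "__main__":
    grid = [0.0, 0.1, 0.5, 1.0, 2.0, 5.0]
    print(bound_holds(-1.0, 1.0, grid), bound_holds(-1.0, 2.0, grid))
```

For the symmetric case $a = -1$, $b = 1$ the check reduces to the classical inequality $\cosh(t) \le e^{t^2/2}$.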

By Markov's inequality, for any random variable $U$ and for any pair $s, t > 0$,
we have

\[ P(U \ge s) = P(e^{tU} \ge e^{st}) \le e^{-st}\, E(e^{tU}) \]

This inequality is also known as Bernstein's exponential bound. Chernov's
method consists in finding the parameter $t > 0$ that minimizes the upper
bound.

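In the context of the sum of $N$ independent random variables $X^i$ with
respective distributions $\mu_i$, the Bernstein-Chernov inequality yields that

\[ P(N\, m(X)(h) \ge s) \le e^{-st} \prod_{i=1}^N E(e^{t h_i(X^i)}) \le e^{-st + t^2 N \sigma^2(h)/8} \]

(recall that $\mu_i(h_i) = 0$ and $\sigma^2(h) = \frac{1}{N}\sum_{i=1}^N \mathrm{osc}^2(h_i)$). By choosing
$t = 4s/(N\sigma^2(h))$ and $s = N\varepsilon$, we conclude that for any $\varepsilon > 0$

\[ P(m(X)(h) \ge \varepsilon) \le e^{-2N\varepsilon^2/\sigma^2(h)} \]

In the same way, we prove that $P(-m(X)(h) \ge \varepsilon) \le e^{-2N\varepsilon^2/\sigma^2(h)}$. This
readily implies the following lemma.

Lemma 7.3.2 (Chernov-Hoeffding)

\[ P(|m(X)(h)| \ge \varepsilon) \le 2\, e^{-2N\varepsilon^2/\sigma^2(h)} \]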
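A quick Monte Carlo experiment illustrates how conservative the two-sided exponential bound can be. The sketch below assumes the $X^i$ are i.i.d. uniform on $[-1/2, 1/2]$ and $h_i(x) = x$, so that $\mu_i(h_i) = 0$ and $\sigma^2(h) = 1$; these choices, the seed and the function names are illustrative assumptions, not part of the text.

```python
import math
import random


def empirical_tail(N: int, eps: float, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of P(|m(X)(h)| >= eps) for centred uniform X^i."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # m(X)(h) is the empirical mean of N centred uniform variables
        m = sum(rng.random() - 0.5 for _ in range(N)) / N
        if abs(m) >= eps:
            hits += 1
    return hits / trials


def chernov_hoeffding_bound(N: int, eps: float, sigma2: float = 1.0) -> float:
    """Upper bound 2 exp(-2 N eps^2 / sigma^2(h)), capped at 1."""
    return min(1.0, 2.0 * math.exp(-2.0 * N * eps * eps / sigma2))


if __name__ == "__main__":
    N, eps = 50, 0.1
    print(empirical_tail(N, eps, trials=2000), chernov_hoeffding_bound(N, eps))
```

The bound uses only the oscillation of $h$, not its variance, which is why the estimated tail is typically much smaller than the bound in this example.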

These exponential bounds were originally proved by Chernov in 1952 for
binomial distributions and extended to general bounded random variables
in 1963 by W. Hoeffding.

The next lemma is a complement of the inequalities of Khinchine, Burkholder,
Davis and Marcinkiewicz-Zygmund.

Lemma 7.3.3 The following assertions are satisfied for any sequence of
$\mathcal{E}$-measurable functions $(h_i)_{i\ge 1}$ such that $\mu_i(h_i) = 0$ for all $i \ge 1$.

• If the functions $h_i$ have finite oscillations, then for any $p \ge 1$ we have

\[ \sqrt{N}\; E(|m(X)(h)|^p)^{1/p} \le d(p)^{1/p}\, \sigma(h) \qquad (7.6) \]

with the sequence of finite constants $(d(n))_{n\ge 0}$ defined for any $n \ge 1$
by the formulae

\[ d(2n) = (2n)_n\, 2^{-n} \quad\text{and}\quad d(2n - 1) = \frac{(2n-1)_n}{\sqrt{n - 1/2}}\; 2^{-(n - 1/2)} \qquad (7.7) \]

• If we have $\mu(h^{2n}) < \infty$ for some $n \ge 1$, then

\[ N^{n}\; E(m(X)(h)^{2n}) \le d(2n)\, \mu((2h)^{2n}) \]

\[ N^{n-1/2}\; E(|m(X)(h)|^{2n-1}) \le d(2n - 1)\, \mu((2h)^{2n})^{1 - \frac{1}{2n}} \]
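The constants in (7.7) are easy to tabulate. The sketch below evaluates them, reading (7.7) as $d(2n) = (2n)_n 2^{-n}$ and $d(2n-1) = (2n-1)_n\, 2^{-(n-1/2)}/\sqrt{n-1/2}$, which is a reconstruction of a damaged formula and should be treated as an assumption. It also checks that the even constants coincide with the Gaussian moments $E(W^{2n}) = (2n-1)!!$; the function names are illustrative.

```python
from math import factorial, sqrt


def falling(n: int, p: int) -> int:
    """(n)_p = n! / (n - p)!."""
    return factorial(n) // factorial(n - p)


def d(p: int) -> float:
    """Constants d(p) of (7.7), as reconstructed in the lead-in above."""
    if p < 1:
        raise ValueError("p must be a positive integer")
    if p % 2 == 0:
        n = p // 2
        return falling(2 * n, n) / 2 ** n
    n = (p + 1) // 2  # p = 2n - 1
    return falling(2 * n - 1, n) / sqrt(n - 0.5) * 2.0 ** (-(n - 0.5))


def gaussian_even_moment(n: int) -> int:
    """E(W^{2n}) = (2n - 1)!! for a standard Gaussian variable W."""
    out = 1
    for k in range(1, 2 * n, 2):
        out *= k
    return out


if __name__ == "__main__":
    for p in range(1, 9):
        print(p, d(p))
```

The agreement of $d(2n)$ with the Gaussian moments is the sense in which the even-order estimates of the lemma cannot be improved asymptotically.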







As we mentioned in the introduction, this technical lemma will be used
throughout this chapter: in the $L_p$-mean error estimates, in the strong
propagation-of-chaos analysis with increasing block sizes, and in the
derivation of a Berry-Esseen inequality for particle models. In this context,
the use of Burkholder type estimates would lead to different conclusions and
much coarser properties.

There are a number of significant and related estimates in the literature
on martingales that apply to our context. For instance, using Burkholder's
inequality (see for instance [287]), we would find that

\[ N^{n}\; E(m(X)(h)^{2n}) \le (18\, B_{2n})^{2n}\, \sigma(h)^{2n} \]

with $(2n) \le B_{2n} = 2n\sqrt{n/(n - 1/2)} \le \sqrt{2}\,(2n)$. This would lead to the
estimate

\[ N^{n}\; E(m(X)(h)^{2n}) \le 2^{n}\, 18^{2n}\, (2n)^{2n}\, \sigma(h)^{2n} \]

The next inequality gives a quick and simple way to measure the improvements
obtained in Lemma 7.3.3:

\[ \frac{d(2n)}{2^{n}\, 18^{2n}\, (2n)^{2n}} = \frac{1}{6^{4n}\,(2n)^{n}} \prod_{k=1}^{n-1}\left(1 - \frac{k}{2n}\right) \le \frac{1}{6^{4n}\,(2n)^{n}} \]

On the other hand, for homogeneous pairs $(\mu_i, h_i) = (\mu, h)$ the central
limit theorem applies and we have the asymptotic result

\[ \left(\sqrt{N}\; m(X)[h/\|h\|_{2,\mu}]\right)^{2n} \xrightarrow[N\to\infty]{d} W^{2n} \]

where $W$ is a centered Gaussian random variable with $E(W^2) = 1$ and
the superscript $d$ stands for the convergence in distribution as $N$ tends to
infinity. In this connection, if we have $\mu(h^{2n}) < \infty$ for some integer $n \ge 1$,
then it is well-known that

\[ \lim_{N\to\infty} N^{n}\; E(m(X)[h/\|h\|_{2,\mu}]^{2n}) = E(W^{2n}) = (2n)_n\, 2^{-n} \]

This asymptotic result already indicates that in this sense the estimates
presented in Lemma 7.3.3 are sharp. As a final illustration of the impact
of these inequalities, we provide hereafter an estimate of the moment-generating
function of the empirical measures $m(X)$.

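Theorem 7.3.1 For any sequence of $\mathcal{E}$-measurable functions $(h_i)_{i\ge 1}$ such
that $\mu_i(h_i) = 0$ for all $i \ge 1$, we have for any $\varepsilon > 0$

\[ \sigma(h) < \infty \;\Longrightarrow\; E\left(e^{\varepsilon\sqrt{N}\,|m(X)(h)|}\right) \le (1 + \varepsilon\,\sigma(h))\; e^{\varepsilon^2\sigma^2(h)/2} \]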







Proof: 

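The $L_n$-inequalities stated in Lemma 7.3.3 clearly imply that for any $\varepsilon > 0$

\[ E(e^{\varepsilon |m(X)(h)|}) = \sum_{n\ge 0} \frac{\varepsilon^{2n}}{(2n)!}\, E(|m(X)(h)|^{2n}) + \sum_{n\ge 0} \frac{\varepsilon^{2n+1}}{(2n+1)!}\, E(|m(X)(h)|^{2n+1}) \]

\[ \le \sum_{n\ge 0} \frac{1}{n!}\left(\frac{\varepsilon^2\sigma^2(h)}{2N}\right)^{n} + \frac{\varepsilon\,\sigma(h)}{\sqrt{N}} \sum_{n\ge 0} \frac{1}{n!}\left(\frac{\varepsilon^2\sigma^2(h)}{2N}\right)^{n} \]

(for the odd terms we have used $1/\sqrt{n + 1/2} \le \sqrt{2}$), from which we conclude that

\[ E(e^{\varepsilon |m(X)(h)|}) \le \left(1 + \frac{\varepsilon\,\sigma(h)}{\sqrt{N}}\right) e^{\varepsilon^2\sigma^2(h)/(2N)} \]

We end the proof of the theorem by replacing $\varepsilon$ by $\varepsilon\sqrt{N}$. ■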

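Proof of Lemma 7.3.3: We first use a symmetrization technique. We consider
a collection of independent copies $X' = (X'^i)_{i\ge 1}$ of the random variables
$X = (X^i)_{i\ge 1}$. We also assume that $(X, X')$ are independent. As usual,
we slightly abuse the notation and we denote by $m(X') = \frac{1}{N}\sum_{i=1}^N \delta_{X'^i}$ the
$N$-empirical distribution associated with $X'$. We observe that

\[ m(X)(h) = E(m(X)(h) - m(X')(h) \mid X) \]

This clearly implies that, for any $p \ge 1$, we have that

\[ E(|m(X)(h)|^p) \le E(|m(X)(h) - m(X')(h)|^p) \]

We first examine the case $p = 2n$ with $n \ge 1$. In this situation, we have

\[ N^{2n}\, E(|m(X)(h) - m(X')(h)|^{2n}) = \sum_{k=1}^{2n}\; \sum_{p_1+\dots+p_k=2n} \frac{(2n)!}{p_1!\dots p_k!}\; \frac{1}{k!} \sum_{\alpha\in\langle k,N\rangle} E\left(\prod_{i=1}^{k} [h_{\alpha(i)}(X^{\alpha(i)}) - h_{\alpha(i)}(X'^{\alpha(i)})]^{p_i}\right) \]

where $\sum_{p_1+\dots+p_k=2n}$ indicates summation over all ordered sets of strictly
positive integers $p_i \ge 1$ such that $p_1 + \dots + p_k = 2n$, and $\langle k, N\rangle$ is the set of
all one-to-one mappings from $\langle k\rangle =_{def.} \{1,\dots,k\}$ into $\langle N\rangle$ (the factor $1/k!$
compensates for the $k!$ simultaneous reorderings of the pairs $(p_i, \alpha(i))$). Since we have

\[ E([h_j(X^j) - h_j(X'^j)]^p) = -E([h_j(X^j) - h_j(X'^j)]^p) = 0 \]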







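for any $1 \le j \le N$ and any odd integer $p$, we check easily that

\[ N^{2n}\, E(|m(X)(h) - m(X')(h)|^{2n}) = \sum_{k=1}^{n}\; \sum_{p_1+\dots+p_k=n} \frac{(2n)!}{(2p_1)!\dots(2p_k)!}\; \frac{1}{k!} \sum_{\alpha\in\langle k,N\rangle} E\left(\prod_{i=1}^{k} [h_{\alpha(i)}(X^{\alpha(i)}) - h_{\alpha(i)}(X'^{\alpha(i)})]^{2p_i}\right) \]

\[ \le (2n)_n \left(\sup_{1\le k\le n}\; \sup_{p_1+\dots+p_k=n}\; \prod_{i=1}^{k} (2p_i)_{p_i}^{-1}\right) E\left(\Big(\sum_{i=1}^{N} [h_i(X^i) - h_i(X'^i)]^2\Big)^{n}\right) \]

Using the fact that for any $p \ge 1$ we have

\[ (2p)_p = (2p)!/p! = 2p\,(2p-1)\dots(2p-(p-1)) = \prod_{k=1}^{p} (p + k) \ge 2^{p} \]

we conclude that

\[ N^{2n}\, E(|m(X)(h) - m(X')(h)|^{2n}) \le (2n)_n\, 2^{-n}\; E\left(\Big(\sum_{i=1}^{N} [h_i(X^i) - h_i(X'^i)]^2\Big)^{n}\right) \]

and therefore

\[ N^{n}\, E(|m(X)(h)|^{2n}) \le (2n)_n\, 2^{-n}\; E\left(\Big(\frac{1}{N}\sum_{i=1}^{N} [h_i(X^i) - h_i(X'^i)]^2\Big)^{n}\right) \]

This implies that

\[ N^{n}\, E(|m(X)(h)|^{2n}) \le (2n)_n\, 2^{-n}\, \sigma(h)^{2n} \]

as soon as $\sigma(h) < \infty$. In the same way, if we have $\mu(h^{2n}) < \infty$, then

\[ N^{n}\, E(|m(X)(h)|^{2n}) \le (2n)_n\; E((m(X)(h^2) + m(X')(h^2))^{n}) \le (2n)_n\, 2^{n}\; E(m(X)(h^2)^{n}) \le (2n)_n\, 2^{n}\, \mu(h^{2n}) \]

For odd integers $p = 2n + 1$, we use the Cauchy-Schwarz inequality to
check that

\[ E(|m(X)(h)|^{2n+1})^2 \le E(|m(X)(h)|^{2n})\; E(|m(X)(h)|^{2(n+1)}) \]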







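From previous estimates, we find that

\[ N^{2n+1}\, E(|m(X)(h)|^{2n+1})^2 \le (2n)_n\, (2(n+1))_{n+1}\; 2^{-(2n+1)}\, \sigma(h)^{2(2n+1)} \]

as soon as $\sigma(h) < \infty$. Since

\[ (2(n+1))_{n+1} = \frac{(2(n+1))!}{(n+1)!} = 2\, \frac{(2n+1)!}{n!} = 2\, (2n+1)_{n+1} \]

\[ (2n)_n = \frac{(2n)!}{n!} = \frac{1}{2n+1}\, \frac{(2n+1)!}{n!} = \frac{(2n+1)_{n+1}}{2n+1} \]

we get

\[ N^{n+1/2}\, E(|m(X)(h)|^{2n+1}) \le \frac{(2n+1)_{n+1}}{\sqrt{n + 1/2}}\; 2^{-(n+1/2)}\, \sigma(h)^{2n+1} \]

In the same way, for any $h$ such that $\mu(h^{2(n+1)}) < \infty$, we have

\[ N^{2n+1}\, E(|m(X)(h)|^{2n+1})^2 \le 2^{2n+1}\, (2n)_n\, (2(n+1))_{n+1}\; \mu(h^{2n})\, \mu(h^{2(n+1)}) \]

Since

\[ \mu(h^{2n})\, \mu(h^{2(n+1)}) \le \mu(h^{2(n+1)})^{2 - \frac{1}{n+1}} \]

we conclude that

\[ N^{n+1/2}\, E(|m(X)(h)|^{2n+1}) \le \frac{(2n+1)_{n+1}}{\sqrt{n + 1/2}}\; 2^{n+1/2}\, \mu(h^{2(n+1)})^{1 - \frac{1}{2(n+1)}} \]

and the proof of the lemma is now completed. ■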



7.3.2 Empirical Processes 

Let $\mathcal{F}$ be a given collection of measurable functions $f : E \to \mathbb{R}$ such that
$\|f\| \le 1$. We associate with $\mathcal{F}$ the Zolotarev seminorm on $\mathcal{P}(E)$ defined by

\[ \|\mu - \nu\|_{\mathcal{F}} = \sup\{|\mu(f) - \nu(f)|\;;\; f \in \mathcal{F}\} \]

(see for instance [276]). No generality is lost and much convenience is gained
by supposing that the unit constant function $f = 1 \in \mathcal{F}$. Furthermore, to
avoid some unnecessary technical measurability questions, we shall also
suppose that $\mathcal{F}$ is separable in the sense that it contains a countable and
dense subset.

To measure the size of a given class $\mathcal{F}$, one considers the covering numbers
$\mathcal{N}(\varepsilon, \mathcal{F}, L_p(\mu))$ defined as the minimal number of $L_p(\mu)$-balls of radius $\varepsilon > 0$
needed to cover $\mathcal{F}$. By $\mathcal{N}(\varepsilon, \mathcal{F})$, $\varepsilon > 0$, and by $I(\mathcal{F})$ we denote the uniform
covering numbers and entropy integral given by

\[ \mathcal{N}(\varepsilon, \mathcal{F}) = \sup\{\mathcal{N}(\varepsilon, \mathcal{F}, L_2(\eta))\;;\; \eta \in \mathcal{P}(E)\} \]

\[ I(\mathcal{F}) = \int_0^1 \sqrt{\log(1 + \mathcal{N}(\varepsilon, \mathcal{F}))}\; d\varepsilon \]

Various examples of classes of functions with finite covering and entropy
integral are given in the book of Van der Vaart and Wellner [311] (see
for instance p. 86, p. 135, and exercise 4 on p. 150). The estimation of the
quantities introduced above depends on several deep results on combinatorics
that are not discussed here. To illustrate these covering numbers, we
content ourselves with mentioning that, for the set of indicator functions
$\mathcal{F} = \{1_{(-\infty, x]}\;;\; x \in \mathbb{R}^d\}$ of cells in $E = \mathbb{R}^d$, we have

\[ \mathcal{N}(\varepsilon, \mathcal{F}) \le c\,(d + 1)\,(4e)^{d+1}\, \varepsilon^{-2d} \]

Since $\int_0^1 \sqrt{\log(1/\varepsilon)}\; d\varepsilon < \infty$, we readily check that $I(\mathcal{F}) < \infty$. The exponential
estimates and the $L_p$-mean errors discussed hereafter will depend
respectively on $\mathcal{N}(\varepsilon, \mathcal{F})$ and $I(\mathcal{F})$. Although it is usually claimed in the
Monte Carlo literature that the convergence of Monte Carlo methods is
dimension-free, previous considerations clearly indicate that this assertion
is far from being true for empirical approximation processes.
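The dimension dependence mentioned above can be made concrete by integrating the covering-number bound numerically. The sketch below assumes the bound $\mathcal{N}(\varepsilon, \mathcal{F}) \le c\,(d+1)(4e)^{d+1}\varepsilon^{-2d}$ for cells in $\mathbb{R}^d$ with the placeholder value $c = 1$ (the universal constant is not specified in the text), and it computes an upper bound on $I(\mathcal{F})$ by a midpoint rule; all function names are illustrative.

```python
import math


def log_covering_bound(eps: float, d: int, c: float = 1.0) -> float:
    """log of c (d + 1) (4e)^(d + 1) eps^(-2d), computed in log scale."""
    return (
        math.log(c)
        + math.log(d + 1)
        + (d + 1) * math.log(4.0 * math.e)
        - 2.0 * d * math.log(eps)
    )


def log1p_exp(x: float) -> float:
    """Numerically stable evaluation of log(1 + exp(x))."""
    if x > 0:
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))


def entropy_integral_bound(d: int, c: float = 1.0, steps: int = 2000) -> float:
    """Midpoint-rule value of int_0^1 sqrt(log(1 + bound(eps))) d(eps)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        eps = (k + 0.5) * h
        total += math.sqrt(log1p_exp(log_covering_bound(eps, d, c)))
    return total * h


if __name__ == "__main__":
    for dim in (1, 5, 20):
        print(dim, round(entropy_integral_bound(dim), 3))
```

Working in log scale avoids floating-point overflow of $\varepsilon^{-2d}$ for small $\varepsilon$ and large $d$. The computed values grow with $d$, which is the dimension effect the paragraph refers to.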

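Let $(E_n, \mathcal{E}_n)_{n=0,1}$ be a pair of measurable spaces and let $\mathcal{F} \subset \mathcal{B}_b(E_1)$.
Also let $M$ be a Markov kernel from $(E_0, \mathcal{E}_0)$ into $(E_1, \mathcal{E}_1)$ and $G : E_0 \to \mathbb{R}$
an $\mathcal{E}_0$-measurable function with $\|G\| \le 1$. We associate with the triplet
$(\mathcal{F}, G, M)$ the collection of $\mathcal{E}_0$-measurable functions

\[ G \cdot M\mathcal{F} = \{G \cdot M(f)\;;\; f \in \mathcal{F}\} \subset \mathcal{B}_b(E_0) \]

Lemma 7.3.4 For any $p \ge 1$, $\varepsilon > 0$, and $\nu \in \mathcal{P}(E_0)$, we have

\[ \mathcal{N}(\varepsilon, G \cdot M\mathcal{F}, L_p(\nu)) \le \mathcal{N}(\varepsilon, \mathcal{F}, L_p(\nu M)) \]

Therefore we find that

\[ \mathcal{N}(\varepsilon, G \cdot M\mathcal{F}) \le \mathcal{N}(\varepsilon, \mathcal{F}) \quad\text{and}\quad I(G \cdot M\mathcal{F}) \le I(\mathcal{F}) \]

Proof:

Lemma 7.3.4 follows from the fact that

\[ \mathcal{N}(\varepsilon, G \cdot \mathcal{F}, L_p(\nu)) \le \mathcal{N}(\varepsilon, \mathcal{F}, L_p(\nu)) \]
\[ \mathcal{N}(\varepsilon, M\mathcal{F}, L_p(\nu)) \le \mathcal{N}(\varepsilon, \mathcal{F}, L_p(\nu M)) \]

The first assertion is obvious. To establish the second inequality, simply
note that, for every function $f$, $|M(f)|^p \le M(|f|^p)$ and go back to the
definition of the covering numbers. This ends the proof of the lemma. ■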







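Lemma 7.3.5 For any $p \ge 1$, we have

\[ \sqrt{N}\; E(\|m(X) - \mu\|_{\mathcal{F}}^{p})^{1/p} \le c\; [p/2]!\; I(\mathcal{F}) \]

Proof:

We consider a collection of independent copies $X' = (X'^i)_{i\ge 1}$ of the random
variables $X = (X^i)_{i\ge 1}$. Let $\epsilon = (\epsilon_i)_{i\ge 1}$ constitute a sequence that is independent
and identically distributed with $P(\epsilon_i = +1) = P(\epsilon_i = -1) = 1/2$.
We also assume that $(\epsilon, X, X')$ are independent. We associate with the
pairs $(\epsilon, X)$ and $(\epsilon, X')$ the random measures $m_\epsilon(X) = \frac{1}{N}\sum_{i=1}^N \epsilon_i\, \delta_{X^i}$ and
$m_\epsilon(X') = \frac{1}{N}\sum_{i=1}^N \epsilon_i\, \delta_{X'^i}$. Notice that

\[ \|m(X) - \mu\|_{\mathcal{F}}^{p} = \sup_{f\in\mathcal{F}} |m(X)(f) - E(m(X')(f))|^{p} \le E(\|m(X) - m(X')\|_{\mathcal{F}}^{p} \mid X) \]

and in view of the symmetry of the random variables $(f(X^i) - f(X'^i))_{i\ge 1}$
we have

\[ E(\|m(X) - m(X')\|_{\mathcal{F}}^{p}) = E(\|m_\epsilon(X) - m_\epsilon(X')\|_{\mathcal{F}}^{p}) \]

This implies that

\[ E(\|m(X) - \mu\|_{\mathcal{F}}^{p}) \le 2^{p}\; E(\|m_\epsilon(X)\|_{\mathcal{F}}^{p}) \]

By using the Chernov-Hoeffding inequality for any $x^1, \dots, x^N \in E$, the
empirical process

\[ f \longrightarrow \sqrt{N}\; m_\epsilon(x)(f) \]

is sub-Gaussian for the norm $\|f\|_{L_2(m(x))} = m(x)(f^2)^{1/2}$. Namely, for any
$f, g \in \mathcal{F}$ and $\delta > 0$

\[ P\left(\sqrt{N}\; |m_\epsilon(x)(f) - m_\epsilon(x)(g)| \ge \delta\right) \le 2\, e^{-\delta^2 / (2\|f - g\|^2_{L_2(m(x))})} \]

Using the maximal inequality for sub-Gaussian processes and the fact that
$0 \in \mathcal{F}$ we arrive at

\[ \sqrt{N}\; \pi_\psi(\|m_\epsilon(x)\|_{\mathcal{F}}) \le c \int_0^\infty \sqrt{\log(1 + \mathcal{D}(\varepsilon, \mathcal{F}, \|\cdot\|_{L_2(m(x))}))}\; d\varepsilon \]

where

• $\pi_\psi(Y)$ is the Orlicz norm of a random variable $Y$ associated with the
increasing convex function $\psi(u) = e^{u^2} - 1$ (and $\psi^{-1}(u) = \sqrt{\log(1 + u)}$)
and defined by

\[ \pi_\psi(Y) = \inf\{a \in (0, \infty) : E(\psi(|Y|/a)) \le 1\} \]

• $\mathcal{D}(\varepsilon, \mathcal{F}, \|\cdot\|_{L_2(m(x))})$ is the maximum number of $\varepsilon$-separated points in
the metric space $(\mathcal{F}, \|\cdot\|_{L_2(m(x))})$ and $c$ is a universal constant (see for
instance Corollary 2.2.8 in [311]).

On the other hand, by a simple calculation, we see that

\[ \mathcal{D}(2\varepsilon, \mathcal{F}, \|\cdot\|_{L_2(m(x))}) \le \mathcal{N}(\varepsilon, \mathcal{F}, \|\cdot\|_{L_2(m(x))}) \le \sup_{y\in E^N} \mathcal{N}(\varepsilon, \mathcal{F}, \|\cdot\|_{L_2(m(y))}) \]

Recalling that $E(|Y|^p)^{1/p} \le [p/2]!\; \pi_\psi(Y)$, for $p \ge 1$, we arrive at

\[ \sqrt{N}\; E(\|m_\epsilon(X)\|_{\mathcal{F}}^{p})^{1/p} \le c\; [p/2]! \int_0^\infty \sup_{y\in E^N} \sqrt{\log(1 + \mathcal{N}(\varepsilon, \mathcal{F}, \|\cdot\|_{L_2(m(y))}))}\; d\varepsilon \]

and therefore

\[ \sqrt{N}\; E(\|m(X) - \mu\|_{\mathcal{F}}^{p})^{1/p} \le c\; [p/2]! \int_0^\infty \sup_{y\in E^N} \sqrt{\log(1 + \mathcal{N}(\varepsilon, \mathcal{F}, \|\cdot\|_{L_2(m(y))}))}\; d\varepsilon \]

Under our assumptions, if $\varepsilon$ is larger than 1, then $\mathcal{F}$ fits in a single $L_2(\mu)$-ball
of radius $\varepsilon$ around the origin, for any $\mu \in \mathcal{P}(E)$. The end of the proof
is now straightforward. ■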



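Lemma 7.3.6 For any $\varepsilon > 0$ and $\sqrt{N} \ge 4\varepsilon^{-1}$ we have that

\[ P(\|m(X) - \mu\|_{\mathcal{F}} > 8\varepsilon) \le 8\, \mathcal{N}(\varepsilon, \mathcal{F})\; e^{-N\varepsilon^2/2} \]

Proof:

Using classical symmetrization inequalities (see Lemma 2.3.7 in [311] or
pp. 14-15 in [271]), for any $\varepsilon > 0$ and $\sqrt{N} \ge 4\varepsilon^{-1}$,

\[ P(\|m(X) - \mu\|_{\mathcal{F}} > \varepsilon) \le 4\, P(\|m_\epsilon(X)\|_{\mathcal{F}} > \varepsilon/4) \qquad (7.8) \]

where $m_\epsilon(X)$ denotes the signed measure $m_\epsilon(X) = \frac{1}{N}\sum_{i=1}^N \epsilon_i\, \delta_{X^i}$ and
$\{\epsilon_1, \dots, \epsilon_N\}$ are symmetric Bernoulli random variables, independent of the
$X^i$'s. Conditionally on the $X^i$'s, and by the definition of the covering numbers,
we easily get by a standard argument that

\[ P(\|m_\epsilon(X)\|_{\mathcal{F}} > \delta \mid X) \le \mathcal{N}(\delta/2, \mathcal{F}, L_1(m(X)))\; \sup_{f\in\mathcal{F}} P(|m_\epsilon(X)(f)| > \delta/2 \mid X) \qquad (7.9) \]

Indeed, let $\{f_p\;;\; 1 \le p \le \mathcal{N}(\delta/2, \mathcal{F}, L_1(m(X)))\}$ be a $(\delta/2)$-coverage of $\mathcal{F}$
for the $L_1(m(X))$-norm. Then

\[ P(\|m_\epsilon(X)\|_{\mathcal{F}} > \delta \mid X) \le P(\sup_p |m_\epsilon(X)(f_p)| > \delta/2 \mid X) \]

Therefore,

\[ P(\|m_\epsilon(X)\|_{\mathcal{F}} > \delta \mid X) \le \mathcal{N}(\delta/2, \mathcal{F}, L_1(m(X)))\; \sup_p P(|m_\epsilon(X)(f_p)| > \delta/2 \mid X) \]

By the Chernov-Hoeffding inequality, for any $f \in \mathcal{F}$ and $\delta > 0$,

\[ P(|m_\epsilon(X)(f)| > \delta/2 \mid X) \le 2\, e^{-N\delta^2/8} \]

As a consequence, we see that $P(\|m_\epsilon(X)\|_{\mathcal{F}} > \varepsilon/4 \mid X)$ is bounded above
by

\[ 2\, \mathcal{N}(\varepsilon/8, \mathcal{F}, L_1(m(X)))\; e^{-N\varepsilon^2/128} \le 2\, \mathcal{N}(\varepsilon/8, \mathcal{F})\; e^{-N\varepsilon^2/128} \]

From (7.8), it follows that

\[ P(\|m(X) - \mu\|_{\mathcal{F}} > \varepsilon) \le 8\, \mathcal{N}(\varepsilon/8, \mathcal{F})\; e^{-N\varepsilon^2/128} \]

as soon as $\sqrt{N} \ge 4\varepsilon^{-1}$, which is the result. This ends the proof of the
desired estimate. ■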



7.4 Strong Law of Large Numbers 

7.4.1 Extinction Probabilities

The objective of this short section is to estimate the probability of extinction
of a class of particle models associated with potential functions
that may take null values. The forthcoming development is valid for any
McKean interpretation model of the form

\[ K_{n+1,\eta} = S_{n,\eta}\, M_{n+1} \]

where $S_{n,\eta}$ is a selection transition satisfying the compatibility condition
$\eta\, S_{n,\eta} = \Psi_n(\eta)$ for any distribution $\eta$ such that $\eta(G_n) > 0$. We also require
that $S_{n,\eta}(x_n, \hat{E}_n) = 1$, as soon as $\eta(\hat{E}_n) > 0$, with $\hat{E}_n =_{def.} G_n^{-1}(0, \infty)$.
These technical requirements are clearly met in the two cases examined on
page 219.

The analysis of the extinction probability arises for instance in physics
when the particle model evolves in an environment with hard obstacles.
For discrete time models, it may happen that at a given date all particles
enter into a hard obstacle. At that time, the system dies and the algorithm
stops.







The analysis of these stopping times is far from complete, and many
questions remain to be answered. In this section, we content ourselves with
proving the following rather crude but reassuring result.

Theorem 7.4.1 Suppose we have $\gamma_n(1) > 0$ for any $n \ge 0$. Then, for any
$N \ge 1$ and $n \ge 0$, we have the estimate

P(r^ < n) < a(n) 

In addition, when assumption $(B)$ is satisfied for some collection of positive
numbers $(\alpha(n))_{n\ge 0}$, then for any $n \ge 0$ we have

\[ P(\tau^N \le n) \le \sum_{p=0}^{n} e^{-N\alpha(p)} \]
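A toy experiment shows how quickly the extinction probability decays with the number of particles. The sketch below is not the general model of this section: it assumes that at each time step every particle independently enters the obstacle with a fixed probability `kill_prob`, and that after selection the next step again starts from $N$ particles outside the obstacle. Under this assumption $e^{-\alpha(p)}$ plays the role of `kill_prob`, and the exact extinction probability $1 - (1 - q^N)^{n+1}$ with $q$ = `kill_prob` is bounded by $(n+1)\,q^N$. All names and parameter values are hypothetical.

```python
import random


def extinct_by(n: int, N: int, kill_prob: float, rng: random.Random) -> bool:
    """Return True if all N particles hit the obstacle at some time 0..n."""
    for _ in range(n + 1):
        survivors = sum(1 for _ in range(N) if rng.random() >= kill_prob)
        if survivors == 0:
            return True  # the whole system dies and the algorithm stops
        # Selection: the N particles are resampled among the survivors, so
        # the next mutation step starts again from N live particles.
    return False


def extinction_frequency(n: int, N: int, kill_prob: float,
                         trials: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the extinction probability up to time n."""
    rng = random.Random(seed)
    return sum(extinct_by(n, N, kill_prob, rng) for _ in range(trials)) / trials


def crude_bound(n: int, N: int, kill_prob: float) -> float:
    """Upper bound (n + 1) * kill_prob**N for the toy model, capped at 1."""
    return min(1.0, (n + 1) * kill_prob ** N)


if __name__ == "__main__":
    for size in (1, 5, 20):
        print(size, extinction_frequency(4, size, 0.5), crude_bound(4, size, 0.5))
```

With a single particle the system dies almost surely within a few steps, while a moderate population makes extinction a rare event; this is the qualitative content of the exponential estimate.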



Proof: 

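Let $\Omega_N(n + 1)$ be the set of events defined by

\[ \Omega_N(n + 1) = \{\forall\, 0 \le p \le q \le n + 1,\; |\eta_p^N(Q_{p,q}1) - \eta_p(Q_{p,q}1)| \le \gamma_q(1)/2\} \]

By the definition of $Q_{p,q}$ we have $\eta_p(Q_{p,q}1) = \gamma_p(Q_{p,q}1)/\gamma_p(1) = \gamma_q(1)/\gamma_p(1)$.
Since $\gamma_p(1) \le 1$, we find that $\eta_p(Q_{p,q}1) \ge \gamma_q(1)$. For the set of events
$\Omega_N(n + 1)$, the following inequalities hold true for any $0 \le p \le q \le (n + 1)$:

\[ 0 < \frac{\gamma_q(1)}{2} \le \eta_p(Q_{p,q}1) - \frac{\gamma_q(1)}{2} \le \eta_p^N(Q_{p,q}1) \le \eta_p(Q_{p,q}1) + \frac{\gamma_q(1)}{2} \le 2 \]

Consequently, for $\Omega_N(n + 1)$ we have $\eta_p^N(G_p) \ge \gamma_{p+1}(1)/2 > 0$ for any $0 \le p \le n$.
This yields the inclusion $\Omega_N(n + 1) \subset \{\tau^N > n\}$. On the other hand,
we notice that $\Omega_N(n + 1) = \Omega_N(n) \cap \Omega'_N(n)$ with

\[ \Omega'_N(n) = \{\forall\, 0 \le p \le n,\; |\eta_p^N(Q_{p,n+1}1) - \eta_p(Q_{p,n+1}1)| \le \gamma_{n+1}(1)/2\} \]

For $n = 0$, we also find that $\Omega_N(1) = \{|\eta_0^N(G_0) - \eta_0(G_0)| \le \gamma_1(1)/2\}$.
By the definition of $\eta_0^N$ (and since $\mathrm{osc}(G_0) \le 1$), using the Chernov-Hoeffding
inequality (see Lemma 7.3.2), we prove that $P(\Omega_N(1)) \ge 1 - 2\exp(-N\gamma_1^2(1)/2)$.
To go a step further, we use the decomposition

\[ P(\Omega_N(n + 1)) = P(\Omega_N(n) \cap \Omega'_N(n)) = P(\Omega_N(n)) - P(\Omega_N(n) \cap \Omega'^{\,c}_N(n)) \]

with

\[ \Omega'^{\,c}_N(n) = \{\exists\, 0 \le p \le n \;:\; |\eta_p^N(Q_{p,n+1}1) - \eta_p(Q_{p,n+1}1)| > \gamma_{n+1}(1)/2\} \]

It follows that $P(\Omega_N(n+1)) \ge P(\Omega_N(n)) - \sum_{p=0}^{n} P(\Omega_N(n) \cap \Omega'^{\,c}_N(p, n))$ with

\[ \Omega'^{\,c}_N(p, n) = \{|\eta_p^N(Q_{p,n+1}1) - \eta_p(Q_{p,n+1}1)| > \gamma_{n+1}(1)/2\} \]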







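For the set of events $\Omega_N(n)$, we have for each $0 \le k \le p \le n$

\[ \eta_k^N(Q_{k,p}1) \ge \gamma_p(1)/2 > 0 \]

Thus, for the set $\Omega_N(n)$, we have for each $0 < k \le p \le n$

\[ \Phi_k(\eta_{k-1}^N)(Q_{k,p}1) = \frac{\eta_{k-1}^N(Q_{k-1,p}1)}{\eta_{k-1}^N(G_{k-1})} \ge \frac{\gamma_p(1)}{2} \]

and for $k = p$ and $k = 0$ we have respectively

\[ \Phi_p(\eta_{p-1}^N)(Q_{p,p}1) = 1 \ge \gamma_p(1)/2 \]

and $\Phi_0(\eta_{-1}^N)(Q_{0,p}1) = \eta_0(Q_{0,p}1) = \gamma_p(1) \ge \gamma_p(1)/2$. Observe that for the
set $\Omega_N(n)$, we have the decomposition

\[ \eta_p^N - \eta_p = \sum_{k=0}^{p} [\Phi_{k,p}(\eta_k^N) - \Phi_{k,p}(\Phi_k(\eta_{k-1}^N))] \]

with the conventions $\Phi_{p,p} = Id$ and $\Phi_0(\eta_{-1}^N) = \eta_0$ for $k = p$ and $k = 0$.

In addition, we have for each $0 \le k \le p$, $f \in \mathcal{B}_b(E_p)$, and $\eta_1, \eta_2 \in \mathcal{P}(E_k)$,
such that $\eta_1(Q_{k,p}1) > 0$ and $\eta_2(Q_{k,p}1) > 0$

\[ \Phi_{k,p}(\eta_1)(f) - \Phi_{k,p}(\eta_2)(f) = \frac{1}{\eta_2(Q_{k,p}1)} \left[(\eta_1(Q_{k,p}f) - \eta_2(Q_{k,p}f)) + \Phi_{k,p}(\eta_1)(f)\, (\eta_2(Q_{k,p}1) - \eta_1(Q_{k,p}1))\right] \]

We conclude that, for the set of events $\Omega_N(n)$, we have for any $0 \le p \le n$

\[ \eta_p^N(Q_{p,n+1}1) - \eta_p(Q_{p,n+1}1) = \sum_{k=0}^{p} \frac{1}{\Phi_k(\eta_{k-1}^N)(Q_{k,p}1)} \left[(\eta_k^N(Q_{k,n+1}1) - \Phi_k(\eta_{k-1}^N)(Q_{k,n+1}1)) + \Phi_{k,p}(\eta_k^N)(Q_{p,n+1}1)\, (\Phi_k(\eta_{k-1}^N)(Q_{k,p}1) - \eta_k^N(Q_{k,p}1))\right] \]

This yields that (for the set $\Omega_N(n)$) for any $0 \le p \le n$

\[ |\eta_p^N(Q_{p,n+1}1) - \eta_p(Q_{p,n+1}1)| \le \frac{2}{\gamma_p(1)} \sum_{k=0}^{p} \left[|\eta_k^N(Q_{k,n+1}1) - \Phi_k(\eta_{k-1}^N)(Q_{k,n+1}1)| + |\eta_k^N(Q_{k,p}1) - \Phi_k(\eta_{k-1}^N)(Q_{k,p}1)|\right] \]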







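It also follows that for the set $\Omega_N(n) \cap \Omega'^{\,c}_N(p, n)$ there exists some index $k$,
$0 \le k \le p \le n$, such that

\[ |\eta_k^N(Q_{k,n+1}1) - \Phi_k(\eta_{k-1}^N)(Q_{k,n+1}1)| \ge \frac{\gamma_{n+1}(1)\,\gamma_p(1)}{8(p + 1)} \ge \frac{\gamma_{n+1}^2(1)}{8(n + 1)} \]

or $|\Phi_k(\eta_{k-1}^N)(Q_{k,p}1) - \eta_k^N(Q_{k,p}1)| \ge \gamma_{n+1}^2(1)/(8(n+1))$. Otherwise, the previous
estimate would give a contradiction. We conclude that for any $0 \le p \le n$

\[ P(\Omega_N(n) \cap \Omega'^{\,c}_N(p, n)) \le \sum_{k=0}^{p} \left[ P\left(\Omega_N(n) \cap \left\{|\eta_k^N(Q_{k,n+1}1) - \Phi_k(\eta_{k-1}^N)(Q_{k,n+1}1)| \ge \tfrac{\gamma_{n+1}^2(1)}{8(n+1)}\right\}\right) + P\left(\Omega_N(n) \cap \left\{|\eta_k^N(Q_{k,p}1) - \Phi_k(\eta_{k-1}^N)(Q_{k,p}1)| \ge \tfrac{\gamma_{n+1}^2(1)}{8(n+1)}\right\}\right) \right] \]

To end the proof, we recall that $\Omega_N(n) \subset \{\tau^N \ge n\} \subset \{\tau^N \ge k\}$ for any
$0 \le k \le n$ so that

\[ P\left(\Omega_N(n) \cap \left\{|\eta_k^N(Q_{k,n+1}1) - \Phi_k(\eta_{k-1}^N)(Q_{k,n+1}1)| \ge \tfrac{\gamma_{n+1}^2(1)}{8(n+1)}\right\}\right) \]
\[ \le P\left(\tau^N \ge k \text{ and } |\eta_k^N(Q_{k,n+1}1) - \Phi_k(\eta_{k-1}^N)(Q_{k,n+1}1)| \ge \tfrac{\gamma_{n+1}^2(1)}{8(n+1)}\right) \]
\[ \le E\left(P\left(|\eta_k^N(Q_{k,n+1}1) - \Phi_k(\eta_{k-1}^N)(Q_{k,n+1}1)| \ge \tfrac{\gamma_{n+1}^2(1)}{8(n+1)} \;\Big|\; \eta_{k-1}^N\right) 1_{\tau^N \ge k}\right) \]
\[ \le 2\exp\left(-2N\, \left(\gamma_{n+1}^2(1)/[8(n + 1)\,\mathrm{osc}(Q_{k,n+1}1)]\right)^2\right) P(\tau^N \ge k) \]

and since $\mathrm{osc}(Q_{k,n+1}1) \le 1$

\[ P\left(\Omega_N(n) \cap \left\{|\eta_k^N(Q_{k,n+1}1) - \Phi_k(\eta_{k-1}^N)(Q_{k,n+1}1)| \ge \tfrac{\gamma_{n+1}^2(1)}{8(n+1)}\right\}\right) \le 2\exp\left(-\frac{N}{32}\left(\gamma_{n+1}^2(1)/(n + 1)\right)^2\right) P(\tau^N \ge k) \]

The last displayed estimates are proved by using the Chernov-Hoeffding
inequality. This readily implies that for each $0 \le k \le n$

\[ P\left(\Omega_N(n) \cap \left\{|\eta_k^N(Q_{k,n+1}1) - \Phi_k(\eta_{k-1}^N)(Q_{k,n+1}1)| \ge \tfrac{\gamma_{n+1}^2(1)}{8(n+1)}\right\}\right) \le 2\exp\left(-\frac{N}{32}\left(\gamma_{n+1}^2(1)/(n + 1)\right)^2\right) \]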







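and similarly for each $0 \le k \le p \le n$

\[ P\left(\Omega_N(n) \cap \left\{|\eta_k^N(Q_{k,p}1) - \Phi_k(\eta_{k-1}^N)(Q_{k,p}1)| \ge \tfrac{\gamma_{n+1}^2(1)}{8(n+1)}\right\}\right) \le 2\exp\left(-\frac{N}{32}\left(\gamma_{n+1}^2(1)/(n + 1)\right)^2\right) \]

Using these two upper bounds, we find that for any $0 \le p \le n$

\[ P(\Omega_N(n) \cap \Omega'^{\,c}_N(p, n)) \le 4(n + 1)\exp\left(-\frac{N}{32}\left(\gamma_{n+1}^2(1)/(n + 1)\right)^2\right) \]

and finally

\[ P(\Omega_N(n + 1)) \ge P(\Omega_N(n)) - 4(n + 1)^2 \exp\left(-\frac{N}{32}\left(\gamma_{n+1}^2(1)/(n + 1)\right)^2\right) \]
\[ \ge P(\Omega_N(1)) - 4(n + 1)^3 \exp\left(-\frac{N}{32}\left(\gamma_{n+1}^2(1)/(n + 1)\right)^2\right) \]
\[ \ge 1 - 8(n + 1)^3 \exp\left(-\frac{N}{32}\left(\gamma_{n+1}^2(1)/(n + 1)\right)^2\right) \]

This ends the proof of the first assertion. We prove the second one by a
simple induction argument. First we observe that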

P(t^ > n) = P(t^ > n - 1 and (G„) > O) 

= E (p (a 1 < i < iV s.t. C e lr^>„-i) 

= P(t^ >n-l) 

- E (p (v 1 < t < iV lrN>n-l) 

By the definition of the particle models, we have 

p(vi<i<iv 

N 

«=1 

Since, for any t] € V{En) with rj{En) > 0, we have 5„,, ^Xn, E„j = 1, for 
any x„ G E„, we readily get the estimate 5„_i,,,Mn ^Xn-i,E„ - E„J < 
e““("), for any Xn-i € E„-i, from which we conclude that 
P (t^ > n) > P (r^ > n - 1) - P (t^ > n - 1) 

>P(t^ >n-l)~ ^ 1 - E 

p=0 



and the end of the proof of the theorem is completed. 







7.4.2 Convergence of Empirical Processes

This section contains several martingale decompositions for a general class
of particle approximation Feynman-Kac models. These key martingales will
provide precise estimates on the convergence of particle density profiles
when the size of the system tends to infinity. They also introduce martingale
techniques and stochastic calculus tools into the numerical analysis of these
algorithms. The asymptotic analysis presented in this section is valid for
any McKean interpretation model satisfying the technical requirements
stated in the beginning of Section 7.4.1.

We start with the analysis of the unnormalized particle models and we
show that this approximation particle model has no bias. The central idea
consists in expressing the difference between the particle measures and the
limiting Feynman-Kac ones as end values of martingale sequences. These
natural martingales are built using the semigroup structure of the unnormalized
Feynman-Kac flow. We also examine the consequences of this result
in the estimation of the extinction probabilities and in the analysis of the
normalized particle model.

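Proposition 7.4.1 For each $n \ge 0$ and $f_n \in \mathcal{B}_b(E_n)$, we let $\Gamma^N_{\cdot,n}(f_n)$ be
the $\mathbb{R}$-valued process defined by

\[ \Gamma^N_{\cdot,n}(f_n) : p \in \{0, \dots, n\} \longrightarrow \Gamma^N_{p,n}(f_n) = \gamma_p^N(Q_{p,n}f_n)\, 1_{\tau^N \ge p} - \gamma_p(Q_{p,n}f_n) \]

For any $p \le n$, $\Gamma^N_{\cdot,n}(f_n)$ has the $F^N$-martingale decomposition

\[ \Gamma^N_{p,n}(f_n) = \sum_{q=0}^{p} \gamma_q^N(1)\, 1_{\tau^N \ge q} \left[\eta_q^N(Q_{q,n}f_n) - \Phi_q(\eta_{q-1}^N)(Q_{q,n}f_n)\right] \qquad (7.10) \]

and its angle bracket is given by

\[ \langle \Gamma^N_{\cdot,n}(f_n) \rangle_p = \frac{1}{N} \sum_{q=0}^{p} (\gamma_q^N(1))^2\, 1_{\tau^N \ge q}\; \eta_{q-1}^N\left(K_{q,\eta_{q-1}^N}\left[Q_{q,n}f_n - K_{q,\eta_{q-1}^N}Q_{q,n}f_n\right]^2\right) \qquad (7.11) \]

with the convention for $q = 0$, $\eta_{-1}^N K_{0,\eta_{-1}^N} = \Phi_0(\eta_{-1}^N) = \eta_0$.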



Before getting into the proof of this proposition, it is interesting to make 
some remarks: 



• We first observe that these martingale decompositions provide sharp
estimates of the mean error between the particle approximation measures
$\gamma_n^N$ and the unnormalized Feynman-Kac measures $\gamma_n$. Indeed







by (7.11) we have that for any n € N and fn G Bh{En) 
S^Po<g<n ^ ®([7^ {Qq,nfn) ^r^>q ” 7g {Qq,nfn)]^) 



9=0 

(7.12) 



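• The estimates presented in Proposition 7.4.1 also allow us to initiate
a comparison between the particle approximation models associated
with the two cases introduced on page 219. In the first situation, we
observe that for any $q \ge 1$, $\eta \in \mathcal{P}(E_{q-1})$, and $\varphi \in \mathcal{B}_b(E_q)$, we have

\[ \eta K_{q,\eta}[\varphi - K_{q,\eta}\varphi]^2 = \Phi_q(\eta)[\varphi - \Phi_q(\eta)(\varphi)]^2 \]

while in the second one

\[ \eta K_{q,\eta}[\varphi - K_{q,\eta}\varphi]^2 = \Phi_q(\eta)(\varphi^2) - \eta[(K_{q,\eta}\varphi)^2] \]
\[ = \Phi_q(\eta)[\varphi - \Phi_q(\eta)(\varphi)]^2 - \eta[(K_{q,\eta}\varphi - \Phi_q(\eta)(\varphi))^2] \]
\[ \le \Phi_q(\eta)[\varphi - \Phi_q(\eta)(\varphi)]^2 \qquad (7.13) \]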

This simple observation indicates that the particle model in the sec- 
ond case is more accurate than the other one. For instance, suppose 
that the potential functions reduce to Gn = 1 and the mutation tran- 
sitions are “trivial” in the sense that = {E,Id). In this 

rather degenerate situation, we have in the first case 



= Voiif - Voif)?) + ES - <W) 

Using the fact that 

niHw) = io(/) 

after some elementary manipulations, we find that 

N E{[r,^{f) - T,n(/)]2) = ,^(1/ - ("l + ^(1 " Jjf 

\ P=0 

In the second case, we have A'n.ry = 7d, and therefore 

^ e ([ 7 ,^^(/) - um = %((/ - %(/)]") 







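• The third, more classical observation is that we can reverse the expectation
and the supremum operators in (7.12) by a simple application
of Doob's maximal inequality. More precisely, for any $p > 1$, we have
that

\[ E\left(\sup_{0\le q\le n} |\Gamma^N_{q,n}(f_n)|^p\right)^{1/p} \le \frac{p}{p-1}\; E(|\Gamma^N_{n,n}(f_n)|^p)^{1/p} = \frac{p}{p-1}\; E(|\gamma_n^N(f_n)\, 1_{\tau^N \ge n} - \gamma_n(f_n)|^p)^{1/p} \qquad (7.14) \]

To prove this traditional martingale inequality, we use the fact that
for any nonnegative random variable $U$ and for any $p > 0$ we have

\[ E(U^p) = p \int_0^\infty t^{p-1}\, P(U \ge t)\, dt \]

If we set $U_n^* = \sup_{0\le q\le n} U_{q,n}$ and $U_{q,n} = |\Gamma^N_{q,n}(f_n)|$ then we readily
check that $(U_{q,n})_{q\le n}$ is an $(F^N_q)_{q\le n}$-submartingale and by Doob's
maximal inequality we have for any $t > 0$

\[ t\, P(U_n^* \ge t) \le E(U_{n,n}\, 1_{U_n^* \ge t}) \]

This yields that

\[ E((U_n^*)^p) = p \int_0^\infty t^{p-1}\, P(U_n^* \ge t)\, dt \le p\, E\left(U_{n,n} \int_0^\infty t^{p-2}\, 1_{U_n^* \ge t}\, dt\right) = p\, E\left(U_{n,n} \int_0^{U_n^*} t^{p-2}\, dt\right) = \frac{p}{p-1}\, E(U_{n,n}\, (U_n^*)^{p-1}) \]

Now, by Hölder's inequality, we conclude that

\[ E((U_n^*)^p) \le \frac{p}{p-1}\; E(U_{n,n}^p)^{1/p}\; E((U_n^*)^p)^{1 - 1/p} \]

which ends the proof of (7.14). Using Lemma 7.3.3, we immediately
obtain the crude estimate

\[ \sqrt{N}\; E\left(\sup_{0\le q\le n} |\Gamma^N_{q,n}(f_n)|^p\right)^{1/p} \le \frac{p}{p-1}\; d(p)^{1/p} \sum_{q=0}^{n} \mathrm{osc}(Q_{q,n}(f_n)) \]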



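Proof of Proposition 7.4.1: We use the decomposition for each $\varphi \in \mathcal{B}_b(E_p)$

\[ \gamma_p^N(\varphi)\, 1_{\tau^N \ge p} - \gamma_p(\varphi) = \sum_{q=0}^{p} \left[\gamma_q^N(Q_{q,p}\varphi)\, 1_{\tau^N \ge q} - \gamma_{q-1}^N(Q_{q-1,p}\varphi)\, 1_{\tau^N \ge q-1}\right] \qquad (7.15) \]

with the convention for $q = 0$, $\gamma_{-1}^N(Q_{-1,p}\varphi)\, 1_{\tau^N \ge -1} = \gamma_p(\varphi)$. Observe
that

\[ \gamma_q^N(Q_{q,p}\varphi)\, 1_{\tau^N \ge q} = \gamma_q^N(1)\, 1_{\tau^N \ge q}\; \eta_q^N(Q_{q,p}\varphi) \qquad (7.16) \]

and

\[ \gamma_{q-1}^N(Q_{q-1,p}\varphi)\, 1_{\tau^N \ge q-1} = \gamma_{q-1}^N(G_{q-1}M_q(Q_{q,p}\varphi))\, 1_{\tau^N \ge q-1} = \gamma_{q-1}^N(1)\, 1_{\tau^N \ge q-1} \times \eta_{q-1}^N(G_{q-1}M_q(Q_{q,p}\varphi)) \]

Since $1 = 1_{\eta_{q-1}^N(G_{q-1}) = 0} + 1_{\eta_{q-1}^N(G_{q-1}) > 0}$, $\;1_{\eta_{q-1}^N(G_{q-1}) > 0}\; 1_{\tau^N \ge q-1} = 1_{\tau^N \ge q}$, and

\[ \eta_{q-1}^N(G_{q-1}) = 0 \;\Longrightarrow\; \eta_{q-1}^N(G_{q-1}M_q(Q_{q,p}\varphi)) = 0 \]

we conclude that

\[ \gamma_{q-1}^N(Q_{q-1,p}\varphi)\, 1_{\tau^N \ge q-1} = \gamma_{q-1}^N(1)\, 1_{\tau^N \ge q}\; \eta_{q-1}^N(G_{q-1}M_q(Q_{q,p}\varphi)) = \gamma_q^N(1)\, 1_{\tau^N \ge q}\; \Phi_q(\eta_{q-1}^N)(Q_{q,p}\varphi) \]

If we set $\varphi = Q_{p,n}(f)$ for some $f \in \mathcal{B}_b(E_n)$, we find that

\[ \gamma_p^N(Q_{p,n}f)\, 1_{\tau^N \ge p} - \gamma_p(Q_{p,n}f) = \sum_{q=0}^{p} \gamma_q^N(1)\, 1_{\tau^N \ge q} \left[\eta_q^N(Q_{q,n}f) - \Phi_q(\eta_{q-1}^N)(Q_{q,n}f)\right] \]

The end of the proof is now clear. ■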



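Theorem 7.4.2 For each $p \ge 1$, $n \in \mathbb{N}$, and for any (separable) collection
$\mathcal{F}_n$ of measurable functions $f : E_n \to \mathbb{R}$ such that $\|f\| \le 1$ (and $1 \in \mathcal{F}_n$),
we have for any $f \in \mathcal{F}_n$

\[ E(\gamma_n^N(f)\, 1_{\tau^N \ge n}) = \gamma_n(f) \qquad (7.17) \]

and for any $r \le n$

\[ \sqrt{N}\; E(\|1_{\tau^N \ge r}\, \gamma_r^N Q_{r,n} - \gamma_r Q_{r,n}\|_{\mathcal{F}_n}^{p})^{1/p} \le c\, (n + 1)\, [p/2]!\; I(\mathcal{F}_n) \qquad (7.18) \]

In addition, for any $\varepsilon > 4/\sqrt{N}$, we have the exponential estimate

\[ P(\|1_{\tau^N \ge r}\, \gamma_r^N Q_{r,n} - \gamma_r Q_{r,n}\|_{\mathcal{F}_n} > \varepsilon) \le 8\,(n + 1)\, \mathcal{N}(\varepsilon_n, \mathcal{F}_n)\; e^{-N\varepsilon_n^2/2} \qquad (7.19) \]

with $\varepsilon_n = \varepsilon/(n + 1)$.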







Proof: 

The first assertion is a simple consequence of Proposition 7.4.1. Using the 
martingale decomposition (7.10), we find that 

||lr^>r7r^ Qr,n “ 7rQr,n||T'n 

<E;^ <(1) lr»>, 

with Tq,n = {Qg,n{f) ' f ^ ^n)- This implies that 
E(||l,N>,7/'Q.,n-7rgr.n||^J'/'’ 

< E;=o 

By Lemma 7.3.5, we have for any r < n 

v/iV E(||„f - . Il^,„ I lr->r < C b/2]! /(J^r.n) 

Now, by Lemma 7.3.4, we find that /(.7v,n) < I{^n) and we conclude that 

yfN E(||l^N>,7r^Qr.n ~ 7rQr.n||^j'/'’ < C (n + 1) [p/2]! I{Tn) 
Using the inequality (7.4.2), we prove that for every e > 0 

P(lllT^>r7r^ Qr,n ” 7rQr,nl|,Fn > (^ + 1)^) 

< (n + 1) sup P(lri»>, y,.n > e) 

By Lemma 7.3.6 and Lemma 7.3.4, we have for any q <n the exponential 
estimate 

P(lr«>, 111" >E I <8Af(£,7;)C-'"’« 

as soon as > 4 e“^. This clearly yields that 

P(||lrN>r7r^Qr.n ~ 7rQr.n||:r„ > (n + l)s) < 8(n + 1) AT{e, Tn) 
and the proof of the theorem is completed. ■ 

Corollary 7.4.1 For any p>l, we have 

P (1) 1,^>„ > 7n (1) /2) > 1 - a(p) ^ 







for some finite constant b{n) < (n + 1) /7„ (1). In addition, for any pair 
(n, N) such that VN > 8/7„(l), we have the exponential estimate 

P(lrN>„ 'rHil) > 7n(l)/2) > 1 - 8(n + 1) 

with e„ = 7„(l)/(2(n + 1)). 

Theorem 7.4.3 Suppose assumption $(B)$ (see p. 220) is satisfied for some
constants $\alpha(n) > 0$. For each $n \in \mathbb{N}$ and for any (separable) collection $\mathcal{F}_n$
of measurable functions $f : E_n \to \mathbb{R}$ such that $\|f\| \le 1$ and $1 \in \mathcal{F}_n$, we
have

sup |E if) l,v>„) - ^ ^ 

^_0 

and 

(1 + /(;■„))+ 2 (7.20) 

for some finite constant b{n) < c.{n + l)/7n(l). In addition, for any 
e € (0, 1) and VN > 48/(£7„(l)), we have the exponential estimate 

9=0 

(7.21) 

with £„ = £ 7„(l)/(12(n + 1)). 

Before getting into the proof of the theorem, it is convenient to note that
for strictly positive potential functions we have $\hat{E}_n = E_n$ and condition
$(B)$ is met for any $\alpha(n)$. In this particular situation, we have $\tau^N = \infty$
and the estimates (7.20) and (7.21) are valid for any $\alpha(n)$. Letting $\alpha(n) \to \infty$,
we find that these estimates hold true without the very r.h.s. term.
Another simple consequence of Theorem 7.4.3 is the following extension of
the Glivenko-Cantelli theorem to particle models.

Corollary 7.4.2 Assume that condition $(B)$ is satisfied, and let $\mathcal{F}_n$ be a
countable collection of functions $f$ such that $\|f\| \le 1$ and $\mathcal{N}(\varepsilon, \mathcal{F}_n) < \infty$ for
any $\varepsilon > 0$. Then, for any time $n \ge 0$, $\|1_{\tau^N \ge n}\, \eta_n^N - \eta_n\|_{\mathcal{F}_n}$ converges almost
surely to 0 as $N \to \infty$.

Proof of Theorem 7.4.3: We use the decomposition 
{Vn if) ~ Vn if)) lr'^>n = ~ 

(7.22) 







If we set /„ = (/ - rjn (/)), then, since 7„ (/„) = 0, (7.22) also reads 

if) - Vn (/)) lr->n = (7^" (/n) lr->n ~ 7n (/n)) lrN>„ 

By Proposition 7.4.1, we have E (7^ (/„) Ir^^^n) = 7n (/n)- This implies 
that 



^{{Vn {f)-Vn{f)) lr'^>n) 

~ ® ~ “ 7n (/n)) 



~ ^ ~ 7^) “ 7n (/n)) 

~ ~ 7^^) (Tn^ “ 7n (/n))) 



If we set hn = - 1, we get the formula 



E((»l5'(/)-»ln(/))V>n) 



Let be the set of events 



-E (7^" (hn) lr^>n ~ 7n (/in)) 

^ (7n (/n) lr'''>n ~ 7n (/n)) lr'''>n) 

(7.23) 



fin = { 7 ^( 1 ) Wn>7n(l)/2} 

= {7n (1) > 7n (1) /2 and > n} 



We recall by Corollary 7.4.1 that 



pW>i- 



Kn? 

N 



with 6(n) < c.((n + l)/7n (1))^- K we combine this estimate with (7.23), 
we find that for any / 6 Bb{En), with ||/|| < 1, 



lE((»?ir(/)-»7n(/))lr~>n)| 

< |E((T,^(/)-7j„(/))lniy)|+2P((Qirr) 

< 2E (|7^^ (/in) lrN>n ~ 7n (/ln)| ifn) lr^>n ~ 7n (/n)l) + ^ 







By Theorem 7.4.2 and the Cauchy-Schwarz inequality, this implies that
Finally, by Theorem 7.4.1, we conclude that 



|E [vH if) lrs>n - Vn if)) | < ^ + E 

q=0 

To prove the second assertion, we first observe that 

(2 >) II [Vn - Vn) lr''>nl|.T„ = lr~>n l|lT">n 7n “ 7n||.T; 

with = {(/ - Tj„(/)) : / G !Fn)- Arguing as above, we also prove that 

II (^n ~ ^n) lT'''>nl|.^n — ~ 7n||.^4 ^ 

2 w 

“ 7^^ IIV>n 7n - 7n||.F; + 2 l(nAr)c 

Since by Lemma 7.3.4 we have I{T!^ < 1 + J{f^n)i a simple application of 
Theorem 7.4.2 now yields that 

< ^ +2P((nJ)‘)i < ^ (1 +/{.F„)) 

with 6(n) < c.(n 4- l)/7n(l). A simple manipulation now gives that 



We prove the final assertion of the theorem using the inequality 

2 

Pr''>n Vn ~ VnW^n ^ l|lr'^>n 7n “ 7nlb; + 2 l(nAT)c + lrW<n 
Arguing as in the proof of Theorem 7.4.2, for every e G (0, 1) we have 

P(pTA'>n »/n -»?n||:r„ > 3e) 

- P->’'^>»» '^n ~ 7n||.r; > 0 + E(2 1(0'^)' > s) + P(lr~<n > 

= P(||l,.>„ 7^^ - 7n||;r; > 7n(l)e/2) + P((fii^r) + P(r^ < n) 







From the previous estimates and Theorem 7.4.2, we find that

g=0 

as soon as VN > 16/(£7„(1)) with £„ = 7„(l)£/(4(n + 1)). This ends the 
proof of the theorem. ■ 



7.4.3 Time-Uniform Estimates

This section is concerned with the long time behavior of $N$-particle approximation
models. The uniform estimates presented in this section are valid
for any McKean interpretation model.

Our strategy will be to connect this problem with the stability properties
discussed in Section 4.3. Unless otherwise stated, we shall suppose that
the pair $(G_n, M_n)$ satisfies the regularity conditions $(G)$ and $(M)_m$ stated
on page 116, for some parameters $\varepsilon_n(G)$ and $\varepsilon_n(M) > 0$. When these
conditions are met the nonlinear Feynman-Kac semigroup $\Phi_{p,n}$ has several
regularity and asymptotic stability properties. These properties are often
expressed in terms of the regularity parameters $(r_{p,n}, \beta(P_{p,n}))$ introduced
in (7.3) on page 218. We also refer the interested reader to Chapter 4 for a
systematic study of these quantities. For instance, we have seen that, for
any fixed $p \ge 0$ and for any $n \ge p + m$, we have

\[ \beta(P_{p,n}) \le \prod_{k=0}^{[(n-p)/m]-1} \left(1 - \varepsilon^{(m)}_{p+km}(G, M)\right) \]

\[ r_{p,n} \le \left[\varepsilon_p(M)\; \varepsilon_{p,p+m}(G)\right]^{-1} \]

with $\varepsilon^{(m)}_p(G, M) = \varepsilon_p^2(M)\; \varepsilon_{p+1,p+m}(G)$ and $\varepsilon_{p,n}(G) = \prod_{p\le q<n} \varepsilon_q(G)$.

We recall that the central formula that allows us to connect the stability
properties of $\Phi_{p,n}$ with the long time behavior of the particle density profiles
is the following decomposition

\[ \eta_n^N - \eta_n = \sum_{q=0}^{n} \left[\Phi_{q,n}(\eta_q^N) - \Phi_{q,n}(\Phi_q(\eta_{q-1}^N))\right] \qquad (7.24) \]

The rationale behind this decomposition is as follows. When the semigroup
$\Phi_{q,n}$ is asymptotically stable, it naturally forgets any erroneous initial
conditions. This property ensures that in some sense for each elementary term

\[ \Phi_{q,n}(\eta_q^N) - \Phi_{q,n}(\Phi_q(\eta_{q-1}^N)) \longrightarrow 0 \quad\text{as}\quad (n - q) \to \infty \]

Consequently, we expect to prove a uniform estimate (w.r.t. the time parameter)
of the sum of the "small errors" induced by replacing at each step
$\Phi_q(\eta_{q-1}^N)$ by the $N$-approximation density profiles $\eta_q^N$. This strategy is not
restricted to Feynman-Kac and particle models and applies to any kind of
approximation scheme built from local approximations of the one-step
mappings $\Phi_n$. We also mention that this technique is well-known in the
literature on numerical analysis of dynamical systems. In this context, and
in contrast to chaotic systems, the stability of the limiting model ensures
that the local errors induced for instance by numerical roundoffs do not
propagate. Because of its importance in practice, we have not chosen to
house this strategy in the proof of some theorem.
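The mechanism behind the decomposition (7.24) can be illustrated on a scalar caricature. The sketch below assumes a one-step map that contracts distances by a factor `beta` and a local approximation error of size at most `delta` at every step; then the error at time $n$ is at most $\sum_{q=0}^{n} \beta^{\,n-q}\,\delta$, which stays below $\delta/(1-\beta)$ uniformly in $n$ when $\beta < 1$ and grows linearly when $\beta = 1$. The names and the scalar setting are illustrative assumptions, not the particle model itself.

```python
def accumulated_error(beta: float, delta: float, n: int) -> float:
    """Worst-case error after steps 0..n.

    Each step first propagates the current error through a beta-Lipschitz
    one-step map and then adds a fresh local error of size delta, which
    gives the telescoping sum  sum_{q=0}^{n} beta**(n - q) * delta.
    """
    err = 0.0
    for _ in range(n + 1):
        err = beta * err + delta
    return err


def uniform_bound(beta: float, delta: float) -> float:
    """Time-uniform bound delta / (1 - beta), valid for 0 <= beta < 1."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("a time-uniform bound requires 0 <= beta < 1")
    return delta / (1.0 - beta)


if __name__ == "__main__":
    for horizon in (1, 10, 100, 1000):
        print(horizon,
              accumulated_error(0.5, 0.01, horizon),
              accumulated_error(1.0, 0.01, horizon))
```

In the particle setting the role of `beta` is played by the contraction coefficients of the semigroup and the role of `delta` by the local sampling errors of order $1/\sqrt{N}$.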

It is first convenient to introduce the random potential functions 



G!f-. 



C. TP. 



and the random bounded operators from Bb{En) into Bb{Eq) defined 
for any (/„,x,) € {Bb{En) x Eg) by 

~ j i^q,nf{^q) ~ ^q,nf{yq)) Gg„{yq) ^q{Vq-l){^yq) 



We associate with the pair (G^„, P^n)< *be random boimded and integral 
operator Qg„ from Bb{En) into Bb{Eg) defined for any (/„, x,) € {Bb{En) x 
Eq) by 

C(/n)(x,) = X Pg^Mn)(Xq) (7.25) 

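Each "local" term in (7.24) can be expressed in terms of $Q^N_{q,n}$ as follows.

For any $q \le n$ and $f_n \in B_b(E_n)$ with $\mathrm{osc}(f_n) \le 1$, we have

$$\Phi_{q,n}(\eta^N_q)(f_n) - \Phi_{q,n}\left(\Phi_q(\eta^N_{q-1})\right)(f_n) = \frac{1}{\eta^N_q(G^N_{q,n})}\;\eta^N_q Q^N_{q,n}(f_n)$$

By construction, we also observe that

$$\Phi_q(\eta^N_{q-1})\left(G^N_{q,n}\right) = 1 \quad\text{and}\quad \Phi_q(\eta^N_{q-1})\left(Q^N_{q,n}(f_n)\right) = 0$$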

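From the considerations above, we have the decomposition

$$\Phi_{q,n}(\eta^N_q) - \Phi_{q,n}\left(\Phi_q(\eta^N_{q-1})\right) = \frac{1}{\eta^N_q(G^N_{q,n})}\left[\eta^N_q - \Phi_q(\eta^N_{q-1})\right] Q^N_{q,n}$$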

Using the properties of Dobrushin's contraction coefficient, we have

$$\|P^N_{q,n}(f_n)\| \le \mathrm{osc}(P_{q,n}f_n) \le \beta(P_{q,n})$$

$$\frac{\|Q^N_{q,n}(f_n)\|}{\eta^N_q(G^N_{q,n})} \le r_{q,n}\,\beta(P_{q,n})$$







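From these estimates, we readily prove the inequality

$$\left|\left[\Phi_{q,n}(\eta^N_q) - \Phi_{q,n}\left(\Phi_q(\eta^N_{q-1})\right)\right](f_n)\right| \le r_{q,n}\,\beta(P_{q,n})\,\left|\left[\eta^N_q - \Phi_q(\eta^N_{q-1})\right](\bar f^N_{q,n})\right|$$

with $\bar f^N_{q,n} = Q^N_{q,n}(f_n)/\|Q^N_{q,n}(f_n)\|$. Now, using Lemma 7.3.3, we check
that for any $p \ge 1$ we have

$$\sqrt{N}\;\mathbb{E}\left(\left|\left[\Phi_{q,n}(\eta^N_q) - \Phi_{q,n}\left(\Phi_q(\eta^N_{q-1})\right)\right](f_n)\right|^p\right)^{1/p} \le 2\,d(p)^{1/p}\; r_{q,n}\,\beta(P_{q,n})$$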

with the sequence of finite constants d(p) introduced in (7.7). Using the 
decomposition (7.24), these “local” estimations readily yield the following 
theorem. 

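Theorem 7.4.4 For any $n \ge 0$, $p \ge 1$, and $f_n \in \mathrm{Osc}_1(E_n)$, we have

$$\sqrt{N}\;\mathbb{E}\left(\left|[\eta_n^N - \eta_n](f_n)\right|^p\right)^{1/p} \le 2\,d(p)^{1/p}\sum_{q=0}^{n} r_{q,n}\,\beta(P_{q,n})$$

with the sequence of finite constants $d(p)$ given for any $p \ge 1$ by

$$d(2p) = (2p)_p\,2^{-p} \quad\text{and}\quad d(2p-1) = \frac{(2p-1)_p}{\sqrt{p-1/2}}\,2^{-(p-1/2)}$$

In addition, suppose that conditions (G) and $(M)_m$ hold true for some
integer $m \ge 1$ and some pair parameters $(\epsilon_n(G), \epsilon_n(M))$ such that $\epsilon(G) = \wedge_n \epsilon_n(G)$ and $\epsilon(M) = \wedge_n \epsilon_n(M) > 0$. Then we have the uniform estimate

$$\sup_{n\ge 0}\;\sup_{f_n\in \mathrm{Osc}_1(E_n)} \sqrt{N}\;\mathbb{E}\left(\left|[\eta_n^N - \eta_n](f_n)\right|^p\right)^{1/p} \le \frac{2\,d(p)^{1/p}\; m}{\epsilon^3(M)\;\epsilon(G)^{2m-1}} \qquad (7.26)$$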
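As a rough numerical companion to the constants in Theorem 7.4.4, the following sketch evaluates $d(p)$ and the right-hand side of the uniform estimate (7.26). It rests on assumptions: $(q)_p = q!/(q-p)!$ denotes the falling factorial, the odd-order constant is taken as $d(2p-1) = (2p-1)_p\,2^{-(p-1/2)}/\sqrt{p-1/2}$ (the form recalled from (7.7), which is only partly legible here), and the helper names are hypothetical.

```python
from math import factorial, sqrt

def falling(q, p):
    """Falling factorial (q)_p = q! / (q - p)!."""
    return factorial(q) // factorial(q - p)

def d(p):
    """Constants d(p) of Theorem 7.4.4 (odd case: assumed form, see lead-in)."""
    if p % 2 == 0:
        n = p // 2
        return falling(2 * n, n) * 2.0 ** (-n)
    n = (p + 1) // 2
    return falling(2 * n - 1, n) / sqrt(n - 0.5) * 2.0 ** (-(n - 0.5))

def uniform_lp_bound(p, m, eps_M, eps_G, N):
    """Right-hand side of (7.26) divided by sqrt(N): a bound on the L_p error."""
    return 2.0 * d(p) ** (1.0 / p) * m / (eps_M ** 3 * eps_G ** (2 * m - 1)) / sqrt(N)
```

The even-order values coincide with the moments of a standard Gaussian variable (for instance `d(4)` is 3.0), which is consistent with the central-limit scaling $\sqrt{N}$ in the theorem.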



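Proof:

Note that for any $p \le n$ we have the estimates

$$\beta(P_{p,n}) \le \left(1-\epsilon^2(M)\,\epsilon(G)^{m-1}\right)^{[(n-p)/m]}$$

$$r_{p,n} \le \epsilon^{-(n-p)}(G) \wedge \left(\epsilon^{-1}(M)\,\epsilon^{-m}(G)\right) \le \epsilon^{-1}(M)\,\epsilon^{-m}(G)$$

Since

$$\sum_{q=0}^{n}\left(1-\epsilon^2(M)\,\epsilon(G)^{m-1}\right)^{[(n-q)/m]} \le m\sum_{k=0}^{[n/m]}\left(1-\epsilon^2(M)\,\epsilon(G)^{m-1}\right)^{k} \le \frac{m}{\epsilon^2(M)\,\epsilon(G)^{m-1}}$$

the end of the proof is clear. ■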
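The geometric-series step of this proof is easy to sanity-check numerically. The sketch below assumes the contraction rate has the form $a = \epsilon^2(M)\,\epsilon(G)^{m-1} \in (0,1]$, and verifies that $\sum_{q=0}^{n}(1-a)^{[(n-q)/m]}$ never exceeds $m/a$, uniformly in the time horizon $n$. The function names are illustrative only.

```python
def contraction_rate(eps_M, eps_G, m):
    """Assumed form of the rate: a = eps(M)^2 * eps(G)^(m-1)."""
    return eps_M ** 2 * eps_G ** (m - 1)

def stability_sum(n, m, a):
    """sum_{q=0}^{n} (1 - a)^{floor((n - q) / m)}."""
    return sum((1.0 - a) ** ((n - q) // m) for q in range(n + 1))

def uniform_bound(m, a):
    """Time-uniform upper bound m / a obtained from the geometric series."""
    return m / a
```

Each exponent value $k$ is hit by at most $m$ indices $q$, which is where the factor $m$ comes from; the remaining sum over $k$ is a geometric series bounded by $1/a$.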

Theorem 7.4.4 can be regarded as the extension of the first part of Lemma
7.3.3 to interacting particle models. Arguing as in the proof of Corollary 7.3.1,
from the $L_p$-inequalities stated in Theorem 7.4.4 we easily estimate
the moment-generating function of the particle density profiles.
The proof of the exponential estimate results from a simple application
of Markov's inequality.

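Corollary 7.4.3 For any $n \ge 0$, $f_n \in \mathrm{Osc}_1(E_n)$, and any $\epsilon > 0$, we have

$$\mathbb{E}\left(e^{\epsilon\sqrt{N}\,|\eta_n^N(f_n)-\eta_n(f_n)|}\right) \le \left(1+\epsilon\,b(n)/\sqrt{2}\right) e^{(\epsilon b(n))^2/2} \qquad (7.27)$$

$$\mathbb{P}\left(|\eta_n^N(f_n)-\eta_n(f_n)| > \epsilon\right) \le \left(1+\epsilon\sqrt{N/2}\right) e^{-N\epsilon^2/(2b(n)^2)} \qquad (7.28)$$

for some finite constant $b(n)$ such that $b(n) \le 2\sum_{q=0}^{n} r_{q,n}\,\beta(P_{q,n})$.

In addition, under the regularity conditions of Theorem 7.4.4,

$$\sup_{n\ge 0}\mathbb{E}\left(e^{\epsilon\sqrt{N}\,|\eta_n^N(f_n)-\eta_n(f_n)|}\right) \le \left(1+\epsilon\,b/\sqrt{2}\right) e^{(\epsilon b)^2/2}$$

$$\sup_{n\ge 0}\mathbb{P}\left(|\eta_n^N(f_n)-\eta_n(f_n)| > \epsilon\right) \le \left(1+\epsilon\sqrt{N/2}\right) e^{-N\epsilon^2/(2b^2)}$$

for some finite constant $b \le 2m/(\epsilon^3(M)\,\epsilon(G)^{2m-1})$.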

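Proof (sketched):

We deduce the exponential probability estimate (7.28) from the first
inequality (7.27). Note that there is no loss of generality in assuming that
$b(n) \ge 1$. On the other hand, by Markov's inequality, we have for any $\epsilon$
and $t > 0$

$$\mathbb{P}\left(|\eta_n^N(f_n)-\eta_n(f_n)| > \epsilon\right) = \mathbb{P}\left(e^{t|\eta_n^N(f_n)-\eta_n(f_n)|} > e^{\epsilon t}\right)$$

Using the first estimate stated in the corollary, we find that

$$\mathbb{P}\left(|\eta_n^N(f_n)-\eta_n(f_n)| > \epsilon\right) \le \left(1+b(n)\,t/\sqrt{2N}\right) e^{(b(n)t)^2/(2N) - \epsilon t}$$

Choosing $t = (N\epsilon)/b^2(n)$, we arrive at

$$\mathbb{P}\left(|\eta_n^N(f_n)-\eta_n(f_n)| > \epsilon\right) \le \left(1+\epsilon\sqrt{N}/(b(n)\sqrt{2})\right) e^{-N\epsilon^2/(2b^2(n))} \le \left(1+\epsilon\sqrt{N/2}\right) e^{-N\epsilon^2/(2b^2(n))}$$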



Corollary 7.4.4 Let !Fn be a countable collection of functions /„ with 
ll/nll < 1 ond finite entropy I{^n) < oo. Suppose that the Markov transi- 
tions Mn have the form Mn{u, dv) = m„(u, v) p„(dv) for some measurable 







function m„ on {En-\ x En) and some p„ G V{En). Also assume that we 
have |logm„(tt,t;)| < dn{v) with Pn(e^”) < oo and for some 

collection of mappings on £?„. Then for any n > 0 and p> I we have 

E {\\€ - %II^J ' < a(p) (^(^n) + b{n)]/y/N (7.29) 

with 6(0) = 0 and 6(n + 1) < r„ p„+i(e^»+‘) r,,„ 0{P,,„) and 

a(p) < c.[p/2]!. 

Proof: 

We use the decomposition 

Vn = [Vn ~ ^n{Vn-l)] + (^n(»?^^-l) ~ ^n{Vn-l)] 



drin 



to check that 
By Lemma 7.3.5, we have 

^ E (llp^ - $n(»?^Li)||^J ' < a(p) I{P„) (7.30) 

To estimate the second term, we observe that for any p G P(f^n) we have 



d^njp) ( s ^ p{Gn-l Tnn(.,t>)) 

dpn /i(G„_i) 

FVom this observation, we find the following decomposition. For any pur 
(|i,j/) G P(En) and for any v G E„-\, we have 



d^njp) 

d^niv) 



(v)-l 



t?(Gn-l) p(Gn-l mn(.,t;)) _ J 

/i(Gn-i) »?(G„_1 m„(.,w)) 



ff(Gn-l) . 




'p(Gn-l mn(.,v))' 




p{Gn-l mn(.,v)) J 


LMG'n-l) J 




_v(Gr,-i fn„(.,v)) 




f?(G„_i m„(.,v)) 



(7.31) 



Under our assumptions, we have the estimates 





< p 20 n(v) 


MGn-l) . 


,1 


KGn-l 


mni>,v)) 


d^n(v)^ ^ 


e 


V{Gn-l) 


“T 


V{Gn-l 


m„(.,v)) 



Consequently, we find that 



^ h/n!i) - »ln-l(/<!))| + \M^l) - Vn-l(A% 

(7.32) 







with 






Cn-l(u) 

Vn—l{dn—l) 



and /«?«(«) = 



Gn-ijv) mn{u,v) 



By Theorem 7.4.4, we get for t = 1, 2 and any p > 1 the estimates 



v/jvE(iT,i:L,(/^:i)-T,„_i(/W)r)»/p 



< a(p) r„_, e2«"(«) 0iP,,n-i) 

On the other hand, we have 

X _ - Vn-l{Gn-l mn(.,t>)) - („) 

dpn^’ dpn 



Consequently, from (7.32) we find that 



E 




d^n{Vn-i) 

d^niVn-l) 




< a(p) b{n) 



(7.33) 



with 6(n) < r„_i p„(e^*-) E^=o 0{Pq,n-i)- Thus, if we combine 

(7.30) with (7.33), we readily prove (7.29). ■ 

Using the same line of argument as in the proof of Theorem 7.4.4, we prove
the following uniform estimate.

Corollary 7.4.5 Assume that the regularity conditions stated in Corol- 
lary 7,4-4 

p{e^) =def. supp„(e(^")) <00 and J(.F) =def. sup/(5’„) < oo 

n>l n>0 



In addition, suppose that (G) and (M)m hold true for some m > 1 and 
some pair parameters {en(G),en{M)) with e(G) = An€„(G) and e{M) = 
A„£n(M) > 0. Then for any p>l we have 



\/N supE (||» 7 ^ - r/nll^J ' < a(p) 






m p(e^) 
£3(M)£2m(G) 



Theorem 7.4.5 For any n 6 N, we let Tn be a countable collection of 
functions f„ such that ||/n|| < 1 and satisfying the uniform entropy condi- 
tion I{F) = 8up„>o I{Fn) < 00 . Assume moreover that the semigroup 
is asymptotically stable with respect to the sequence (.^n)n>o in the sense 
that 

lim sup sup ||$,,,+„(/i,) - ^,,,+n(i^,)||* , „ = 0 







When condition (G) holds true with infn>ien(G) e{G) > 0, then we 
have the following uniform convergence result with respect to time: 

In addition, let us assume that the semigroup $\Phi_{p,n}$ is exponentially stable
in the sense that there exist some positive constant $\lambda > 0$ and $n_0 \ge 0$ such
that for any $n \ge n_0$

sup sup 11 $, ,,+„(/!,) - ^q,q+n{t'q) 

Then for any p> I we have the uniform estimate 

supiV“/2 E (||,,^^ - " < a(p) (1 + e^'/CF)) (7.35) 

as soon as N > exp (2no (A + A')) with 

a = and A' = 1 + log (1/£(G)) 



Proof: 

By Lemma 7.3.5 and arguing as in the beginning of the proof of Theo- 
rem 7.4.2, one proves that for any 0<q<n and p > 1 

^^N E (n$,.„«) - $,.n i^qiv^-i)) II^J' < o(p) 

By the decomposition (7.24), one concludes that for any 0 < n < T 

E (||»?^ - Pnll^J " < a(p) m (T -H l)A^(G) (7.36) 

On the other hand, for any g > 0 we have 

q-\-T 

WVg+T - Vq+rWrn < \\^r,q+TiVr)-^r,q+T{^riVr-l))\\j:^ 

r=9+l 

+ ||^9.9+T(»/f ) - $,,,+t(?/9)||j:^ 

Under our assumptions, we find that 

q-^T 

WVq+T - Vq+rhn ^ \\^r,q+TiVr)-^r,q+T{^r{Vr-l))\\j:^+e~^'^ 

r=q+l 

and arguing as above one gets that for any T > no 

supE (||»?^T - Vq+T\\jr) ' < e~^'^ + o(p) (T -f- 1) /(.F) (7.37) 

g>0 ViV 







Combining (7.36) and (7.37), we readily prove the uniform estimate

X^T 

supE {\\rj^ - »7n||^J ' < e-^'^ + a(p) ^ /(.F) 

n>0 V jV 



for any T>no and where A' = 1 - loge(G). Obviously, if we choose N >1 
and 



T = 



1 logiV 
2A + A'. 



+ 1 ^ Tlo 



where [r] denotes the integer part of r G R, we get (7.35). The end of the 
proof of the theorem is now clear. ■ 



We end this section with some brief comments on the long time behavior
of interacting processes. For a more thorough study we refer the reader
to Section 12.4. For time-homogeneous Feynman-Kac models and in the
context of the statement of Corollary 7.4.5, the measure-valued process
$\eta_n$ admits a unique invariant measure $\eta_\infty$ (see for instance Chapter 4 or
Chapter 5). In addition, we have for any $p \ge 1$ and $n \ge 0$

E ih!! - Voofjr)' < 6 -Ml - p)L"/"*J^ 

for some constants $\rho \in (0,1)$ and $b < \infty$ that depend on the pair $(G, M)$.
On the other hand, under the regularity conditions of Corollary 7.4.5, the
Markov chain $\xi_n$ has a unique invariant measure on the product space
$E^N$. The estimate above provides an asymptotic estimate of the limiting
empirical measures in terms of $\eta_\infty$; that is, we have

Urn E {WVn - Vooy) = 0 = lim lim E (||p^ - 7/oolk) 




8 

Propagation of Chaos 



8.1 Introduction 

This chapter is concerned with propagation-of-chaos properties of particle
models. These properties measure the adequacy of the laws of the particles
with the desired limiting distribution. They also allow us to quantify the
independence between particles. Loosely speaking, the initial configuration
of an N-particle model consists of N independent particles in a "complete
chaos." Then they evolve and interact with one another. The nature of the
interactions depends on the McKean interpretation of the limiting process
(see Section 2.5.3). For any fixed time horizon n, when the size of the system
N tends to infinity, any finite block of $q\,(\le N)$ particles asymptotically
behaves as a collection of independent particles. In other words, the law of
any q particle paths of length n converges as $N \to \infty$ towards the q-tensor
product of the n-path McKean measure.

The interpretation of propagation of chaos differs across the application
areas of particle models that we consider.

From the physical point of view, particle algorithms are often related to
some microscopic particle interpretation of some physical evolution equation.
In this context, the limiting distribution flow model is regarded as
an infinite particle model. Here propagation-of-chaos estimates give precise
information on the degree of interaction between the particles. They justify
in some sense the well-founded microscopic particle interpretations.

From a statistical point of view, particle methods are rather regarded
as particle simulation techniques of complex path distributions. In this
context, propagation-of-chaos properties offer precise information on the
numerical quality of these simulation techniques. First of all, they make it
possible to quantify independence between the simulated variables. Moreover,
they guarantee the adequacy of their laws with the desired target
distribution. For instance, in engineering applications such as nonlinear
filtering or global optimization problems, propagation of chaos ensures the
adaptation of the stochastic grid with the signal conditional distributions
or the Boltzmann-Gibbs concentration laws.

From the biological perspective, the propagation-of-chaos of genetic mod- 
els gives precise information on their genealogical structure. More precisely, 
they quantify the degree of interaction between the ancestral lines of evo- 
lution of a group of individuals. They provide information not only on 
current populations but also on the complete genealogies of ancestral lines 
that have disappeared. 

We design three strategies with different precision levels. In the first one,
we examine the propagation of chaos of the particle model associated with
the McKean interpretation model

$$K_{n+1,\eta}(x,\cdot) = \epsilon_n G_n(x)\,M_{n+1}(x,\cdot) + \left(1-\epsilon_n G_n(x)\right)\Phi_{n+1}(\eta) \qquad (8.1)$$

where $\epsilon_n$ are nonnegative constants such that $\epsilon_n G_n \le 1$. Note that the pair
of examples provided on page 219 fit into this model, and the case $\epsilon_n = 0$
corresponds to the traditional mutation/selection genetic algorithm. We
present a general and basic strategy that probably works for other McKean
interpretations but does not give any information on the rate of propagation
of chaos. Another drawback of this technique is that it is restricted to locally
compact and separable metric state spaces.
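To make the McKean interpretation concrete, here is a minimal simulation sketch of one transition of the associated N-particle model. It assumes the transition has the form "with probability $\epsilon_n G_n(x)$ the particle mutates from its own position, otherwise it first jumps to a particle selected with weights proportional to $G_n$ and then mutates". The callables `G` and `mutate` and the function name `mckean_step` are hypothetical choices for illustration.

```python
import math
import random

def mckean_step(particles, G, mutate, eps, rng):
    """One transition of the N-particle model for an (8.1)-type McKean kernel.

    particles : list of current states
    G         : potential function, with eps * G(x) <= 1 for every state x
    mutate    : mutate(x, rng) draws a sample from the mutation kernel started at x
    eps       : nonnegative constant; eps = 0 gives plain selection/mutation
    """
    weights = [G(x) for x in particles]
    assert all(0.0 <= eps * w <= 1.0 for w in weights), "need eps * G <= 1"
    new_particles = []
    for x, w in zip(particles, weights):
        if rng.random() < eps * w:
            parent = x                   # acceptance: the particle keeps its own position
        else:
            # selection: jump to a particle drawn with weights proportional to G
            parent = rng.choices(particles, weights=weights, k=1)[0]
        new_particles.append(mutate(parent, rng))
    return new_particles

def gaussian_potential(x):
    """Example potential with values in (0, 1]."""
    return math.exp(-0.5 * x * x)

def random_walk(x, rng):
    """Example mutation: Gaussian random-walk move."""
    return x + rng.gauss(0.0, 1.0)
```

A typical use is to iterate `mckean_step` from N independent initial draws; the population size stays equal to N, and with `eps = 0` every particle is resampled at every step, as in the simple genetic algorithm.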

One important question arising in practice is to estimate the rate of
propagation of chaos with respect to the pair parameters (q, n). This leads for
instance to propagation-of-chaos properties with respect to increasing
particle block sizes and/or time horizons. We first derive strong
propagation-of-chaos estimates with respect to the relative entropy criterion. This strategy
is based on an inequality of Csiszar on exchangeable measures. It allows us
to restrict the analysis to profile measures. The only drawback of this
elegant entropy technique is that it requires some regularity on the mutation
transitions. As a result, it does not apply to path-space and genealogical tree
models.

The third strategy is not based on any kind of regularity property of
the Feynman-Kac model but it is restricted, as presented, to the simple
mutation/selection genetic model. We use as a tool a natural tensor product
Feynman-Kac semigroup approach with respect to time horizons and
particle block sizes. We derive several propagation-of-chaos estimates for
Boltzmann-Gibbs measures from a precise moment analysis of empirical
measures and from an original transport equation relating q-tensor product
and symmetric statistic type empirical measures. This analysis applies
to the study of the asymptotic behavior of genetic historical processes and
their complete genealogical tree evolution. In contrast to traditional studies
on q-symmetric statistics, here the particles are not independent but interact
with one another according to precise mutation and selection genetic
rules. In this sense, these results can also be considered as an extension of
the traditional asymptotic theory of q-symmetric statistics to interacting
random sequences.



8.2 Some Preliminaries 



To get an overview of Feynman-Kac and McKean particle interpretations,
we recommend that the reader begin the study of the propagation-of-chaos
properties with the introductory Section 7.2.

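Definition 8.2.1 We say that the distribution $\mathbb{P}^N_{\eta_0}$ is weakly $\mathbb{K}_{\eta_0}$-chaotic
if we have for any $n \in \mathbb{N}$, $q \ge 1$, and $(F^i)_{i\ge 1} \in C_b(E_{[0,n]})^{\mathbb{N}}$

$$\lim_{N\to\infty}\mathbb{E}^N_{\eta_0}\left(\prod_{i=1}^{q} F^i(\xi^i_{[0,n]})\right) = \prod_{i=1}^{q}\mathbb{K}_{\eta_0,n}(F^i)$$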



This property can be restated in terms of the law of the first g path particles. 
To present this alternative description, let (g, N) be a pair of integers with 
I <q< N and for each 0 < p < n we set 






xEi 



and 



^(P.n) - %-hl.n] 



These sets represent the path space of a block of q particles from time p 
to the current time n. They are connected to the product spaces = 

(f;(p,„])« by the mapping 0« „ : defined by 



For p = 0, we slightly abuse the notation and write 0’ instead of 0 q_„. 
By P^^^ we denote the distribution of the first g-path particles 

"■‘k 5i),n| = (8 ft) ^ ** •’S’l = Law((a),s(<,) € 

V{Ef^ be their nth time marginals. For q = N, we simplify the notation 
and we write P^|» instead of P^n^^ In this notation, we see that P^ is 
weakly K,^ -chaotic if and only if we have that 

Urn PW’)(F) = K®%(F) 







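for any q-tensor product function $F = F^1\otimes\cdots\otimes F^q \in C_b(E^q_{[0,n]})$. It is often
simpler to derive this type of weak law of large numbers in terms of the path
empirical measure

$$\mathbb{K}^N_n = \frac{1}{N}\sum_{i=1}^{N}\delta_{\xi^i_{[0,n]}}$$

To be more precise, let $\langle N\rangle^{\langle q\rangle}$ be the set of all mappings from $\langle q\rangle = \{1,\ldots,q\}$
into $\langle N\rangle = \{1,\ldots,N\}$ and $\langle q,N\rangle \subset \langle N\rangle^{\langle q\rangle}$ the subset of all
$(N)_q = N!/(N-q)!$ one-to-one mappings. We associate with $\mathbb{K}^N_n$ a pair
of q-tensor and symmetric statistic type empirical distributions

$$(\mathbb{K}^N_n)^{\otimes q} = \frac{1}{N^q}\sum_{a\in\langle N\rangle^{\langle q\rangle}}\delta_{\left(\xi^{a(1)}_{[0,n]},\ldots,\xi^{a(q)}_{[0,n]}\right)} \qquad (\mathbb{K}^N_n)^{\odot q} = \frac{1}{(N)_q}\sum_{a\in\langle q,N\rangle}\delta_{\left(\xi^{a(1)}_{[0,n]},\ldots,\xi^{a(q)}_{[0,n]}\right)} \qquad (8.3)$$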



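In contrast to traditional q-symmetric statistics, the N random paths $\xi^i_{[0,n]}$
are not independent: they interact with each other according to some
precise mutation and genetic selection rules. By symmetry arguments, we
observe that for any $F \in B_b(E^q_{[0,n]})$ we have

$$\mathbb{E}^N_{\eta_0}\left((\mathbb{K}^N_n)^{\odot q}(F)\right) = \mathbb{E}^N_{\eta_0}\left(F\left((\xi^i_{[0,n]})_{1\le i\le q}\right)\right) = \mathbb{P}^{(N,q)}_{\eta_0,n}(F)$$

The next central observation is that the empirical measures (8.3) are connected
by a Markov transport equation of the form

$$(\mathbb{K}^N_n)^{\otimes q} = (\mathbb{K}^N_n)^{\odot q}\,R^{(q)}_N \quad\text{where}\quad R^{(q)}_N = \frac{(N)_q}{N^q}\,Id + \left(1-\frac{(N)_q}{N^q}\right)\tilde R^{(q)}_N$$

and $\tilde R^{(q)}_N$ a Markov transition on $E^q_{[0,n]}$. We will give the proof of this result
with a precise and explicit description of $\tilde R^{(q)}_N$ in Section 8.6. One easy
consequence of this formula is that

$$\left\|(\mathbb{K}^N_n)^{\otimes q} - (\mathbb{K}^N_n)^{\odot q}\right\|_{tv} \le 1-\frac{(N)_q}{N^q} \le \frac{(q-1)^2}{N} \qquad (8.4)$$

Using this property, we have the following lemma.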
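The combinatorial weight behind this estimate can be verified by brute force. The sketch below assumes the weight is the proportion $(N)_q/N^q$ of one-to-one mappings among all mappings from $\{1,\ldots,q\}$ into $\{1,\ldots,N\}$, and checks the elementary bound $1-(N)_q/N^q \le (q-1)^2/N$ on small cases. Function names are illustrative.

```python
from itertools import product
from math import factorial

def falling(N, q):
    """(N)_q = N! / (N - q)!, the number of one-to-one maps from q points into N points."""
    return factorial(N) // factorial(N - q)

def injective_fraction(N, q):
    """Fraction of one-to-one maps among all N**q maps, by direct enumeration."""
    total = 0
    injective = 0
    for a in product(range(N), repeat=q):
        total += 1
        if len(set(a)) == q:
            injective += 1
    return injective / total

def tensor_vs_symmetric_gap(N, q):
    """The quantity 1 - (N)_q / N**q appearing in the total variation estimate."""
    return 1.0 - falling(N, q) / N ** q
```

For fixed q the gap decays like $q(q-1)/(2N)$, so the tensor-product and symmetric-statistic empirical measures are interchangeable at the scale $1/N$.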

Lemma 8.2.1 The sequence of distributions is weakly K,j^-chaotic if 
and only if for any n e N the random distributions converge in law to 
the deterministic measure 

Proof: 

We first suppose that is weakly K,^ -chaotic. In this situation, we have 
for any / e C 6 (£?[o,n]) 

E^( W(/) - - EX(/(f|M)/«|o.,]))l 



+E!;(/«|o..|)/«|o.n|)) - 2K.»,.(/)ES(/(f|V„|)) + 







Since is weakly K,^-chaotic, we easily find that 

and we conclude that converge in law to K,K>.n- reverse angle, if 
converge in law to itjo.n. we have for any F = € Cb{E?Q „.) with 

< 1 

KiUU ^(^O.n))) - m=l K^.n(^;t)l 

< |E^((lC)®’(i^)) - E^{{K^)^o{F))\ + \Eil^{{K^)^oiF)) - K®V(F)| 
= |E^([(IC)®’ - (IC)®’](^))I + KW^)) - ^(I^o.n)| 

with the bounded continuous function H on V{E[o „]) defined for any fi 6 
P(F(o,„,) by JI(m) = nil m(K)- By (8.4) we get ’ 






AT 



+|E5^(H(lC))-/f(K,^.n)l 

Since converge in law to K»x)»n> the proof of the lemma is easily com- 
pleted. ■ 

A stronger version of the propagation-of-chaos property is presented in the 
next definition. 

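Definition 8.2.2 We say that the distribution $\mathbb{P}^N_{\eta_0}$ is strongly $\mathbb{K}_{\eta_0}$-chaotic
if we have for any $n \in \mathbb{N}$ and $q \ge 1$

$$\lim_{N\to\infty}\left\|\mathbb{P}^{(N,q)}_{\eta_0,n} - \mathbb{K}^{\otimes q}_{\eta_0,n}\right\|_{tv} = 0$$

By symmetry arguments, we see that $\mathbb{P}^N_{\eta_0}$ is strongly $\mathbb{K}_{\eta_0}$-chaotic if and
only if we have for any $n \ge 0$

$$\lim_{N\to\infty}\;\sup\left|\mathbb{E}^N_{\eta_0}\left((\mathbb{K}^N_n)^{\odot q}(F)\right) - \mathbb{K}^{\otimes q}_{\eta_0,n}(F)\right| = 0$$

where the supremum in the display above is taken over all functions
$F \in B_b(E^q_{[0,n]})$ with $\|F\| \le 1$.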





8.3 Outline of Results 

As mentioned in the introductory section, Section 8.4 is only concerned
with weak propagation-of-chaos properties. We examine the particle model
associated with the McKean transitions (8.1) introduced on page 254. When
the state spaces $(E_n, \mathcal{E}_n)$ are locally compact and separable metric spaces,
we prove that the sequence of distributions $\mathbb{P}^N_{\eta_0}$ is weakly $\mathbb{K}_{\eta_0}$-chaotic. To
describe our main result with some precision, we let $\Pi_n \subset C_b(E_{[0,n]})$ be the
subset of all tensor product functions of the form

$$F_n = f_0\otimes\cdots\otimes f_n, \quad\text{where}\quad f_0\in C_b(E_0),\ldots,f_n\in C_b(E_n)$$

with $\vee_{p=0}^{n}\|f_p\| \le 1$. We also denote by $\Pi_n^q \subset C_b(E^q_{[0,n]})$, with $q \ge 1$, the
subset of q-tensor product functions of the following form

$$H_n = F_n^1\otimes\cdots\otimes F_n^q, \quad\text{where}\quad F_n^i\in\Pi_n$$

Theorem 8.3.1 For any n G N and p>l, we have 

sup Ej(|lC(Fn)-K^.n(F„)r)i <o(p) b{n)/VN 

F„6n„ 

and 

sup lE^((lC)®''(Hn)) - K^„(Hn)| < a(p) b{n)/N (8.5) 

By the Stone-Weierstrass theorem, the set of all finite linear combinations
of functions in $\Pi_n$ is a dense subset of $C_b(E_{[0,n]})$ as soon as the "marginal
state spaces" $E_n$ are locally compact and separable metric spaces. Using
this density argument, we conclude that $\mathbb{P}^N_{\eta_0}$ is weakly $\mathbb{K}_{\eta_0}$-chaotic.

The next sections, Section 8.5 to Section 8.9, cover general measurable
state-space models and discuss strong propagation-of-chaos estimates. To
describe these results precisely, we let $Q_{p,n}$, respectively $Q^{(q)}_{p,n}$, be the linear
semigroup associated with the unnormalized Feynman-Kac distributions
$\gamma_n$ and respectively $\gamma_n^{\otimes q}$. Notice that $Q_{p,n}(f_n) = G_{p,n}\,P_{p,n}(f_n)$ with the
potential function $G_{p,n}$ and the Markov transition $P_{p,n}$

$$G_{p,n} = Q_{p,n}(1) \quad\text{and}\quad P_{p,n}(f_n) = Q_{p,n}(f_n)/Q_{p,n}(1)$$

(see for instance Section 7.2 for a brief overview of Feynman-Kac
semigroups). Let $(G^{(q)}_{p,n}, P^{(q)}_{p,n})$ be the corresponding pair potential and Markov
transition associated with the semigroup $Q^{(q)}_{p,n}$.

As usual, the asymptotic estimates developed in this chapter are
expressed in terms of the parameters $(r_{p,n}, \beta(P_{p,n}))$ introduced on page 218.
To simplify the notation, sometimes we write $r_n$ instead of $\sup_{p\le n} r_{p,n}$.
In Section 8.5, we discuss increasing propagation-of-chaos estimates with
respect to the relative entropy criterion. The first main result is the
following theorem.






Theorem 8.3.2 Suppose the Markov transitions M„ satisfy the regularity 
condition (M)^^ stated on page 116, forp = 2 and some functions kn- 
Then for any q< N we have that 

iVEnt(pW«)|K®%)<6(n)g 

for some finite constant 

b{n) < c ^ (l + 7/p+i(|fcp+ip)) rg,pp{Pg,p)f 

p=0 9=0 

with\kn\= sup A^,(x„_i, .) G L 2 (t/„). 

Xn-l6Bn-l 

To illustrate another impact of this result in practice, we present
hereafter an easily derived consequence of Theorem 8.3.2. For simplicity, we
further assume that the Feynman-Kac model (7.1) is time-homogeneous,
$(E_n, G_n, M_n) = (E, G, M)$, and the following regularity condition is met for
any $x, y \in E$ and for some $m \ge 1$ and $\epsilon(G), \epsilon(M) \in (0,1]$:

$$(G,M):\quad G(x) \ge \epsilon(G)\,G(y) \quad\text{and}\quad M^m(x,\cdot) \ge \epsilon(M)\,M^m(y,\cdot) \qquad (8.6)$$

Combining Theorem 8.3.2 with some well-known results on the stability of 
Feynman-Kac semigroup, we will prove the following increasing propagation- 
of-chaos properties. 

Let $n(N)$ and $q(N)$, $N \ge 1$, be respectively nondecreasing sequences of
time horizons and particle block sizes such that $\lim_{N\to\infty} n(N)q(N)/N = 0$.
In this situation, we have



lim;v ^ 



q{N)n{N) 



Ent(pW’^ I K®%) < 



e6(M) e{G)*^ 



as soon as »?(|A:p) =def. sup„>i »?n(|*;n|^) < oo. 

As mentioned in the introduction, to analyze precisely the limiting
behavior of the path-space distributions $(\mathbb{K}^N_n)^{\otimes q}$, we develop in Section 8.6 to
Section 8.9 an original approach based on q-tensor product and path-space
Feynman-Kac semigroups. This strategy brings the dynamical structure of
interactions into the study of the propagation-of-chaos properties in a
natural way. It allows us to use the stability properties of the limiting system
to derive precise and uniform estimates with respect to the time parameter.
In Section 8.8, we express precise strong propagation-of-chaos estimates in
terms of Dobrushin's ergodic coefficient associated with a Markovian and
Feynman-Kac type transition on a product space. This approach to strong
propagation-of-chaos is restricted to the McKean interpretation model (8.1)
with $\epsilon_n = 0$, and the corresponding simple genetic mutation/selection
model. Our first main result is the following theorem.







Theorem 8.3.3 For any N >q>l, we have 




where 0{Pp^n) € [0,1], represents the Dobrushin ergodic coefficient asso- 
ciated with the Markov transition Pp% and ep^n '• (0, oo) (0, oo) is the 
collection of mappings defined by 



Cp,n(«) = (rp,n - 1)^(1 + (rp,n ~ 1) \/u) exp ((rp,n - 1)* u) 

( 8 . 8 ) 



The estimate (8.7) holds true for a fairly general and abstract class of
Feynman-Kac models. It can be used to analyze the strong
propagation-of-chaos properties of genetic particle systems as well as those of the
corresponding genealogical tree models.

We further suppose the regularity condition $(G, M)$ is satisfied for some
$m \ge 1$ and $\epsilon(G), \epsilon(M) \in (0,1)$. In this case we will deduce from
Theorem 8.3.3 the following increasing propagation-of-chaos properties.

Let $n(N)$ and $q(N)$, $N \ge 1$, be respectively nondecreasing sequences of
time horizons and particle block sizes such that $\lim_{N\to\infty} n(N)q^2(N)/N = 0$.
In this situation, we have



N 



lim^v-^oo II*'' ^ ))^ 

Note that Theorem 8.3.3 does not apply to the complete N-genealogical 
particle model ^[o.n]* Our second main result is the following theorem. 

Theorem 8.3.4 For any n,q,N >1 such that (n + l)q< N we have 

"2 r Mn + 1)? 



-(VO®- . ® »?n)’||tv < C 1) 



1 + Cn(' 



N 



■) 



(8.9) 



with the mapping e„(«) defined as in (8.8) by replacing the constants rp^n 

6yf„ = supp<„Tp,„. 

This second estimate readily implies the following increasing propagation- 
of-chaos property: If we have limjv_^oo ^(N)/N = 0, then for «my n € N 

IK*'"” - (-»> ® ■ • • ® S ‘(n) 

with b(n) < c (n -I- 1)®(1 •+• (r„ - 1)^). In the case of time-homogeneous 
models satisfying condition (G, M) for some m > 1 and e(G), e(M) e (0, 1), 
we shall also prove that 

t(«)<c(n-l-l)V(€"*(G)€(M))2 



In Section 8.7, we measure the propagation-of-chaos properties of Boltz- 
mann-Gibbs transformations. The complete proofs of Theorem 8.3.3 and 
Theorem 8.3.4 are housed in Section 8.9. 







8.4 Weak Propagation of Chaos 

This section is concerned with the following proof. 



Proof of Theorem 8.3.1: For any function F„ = ®p=o/p G !!„, we 
have the decomposition 



lC(i^n) - K^,n(Fn) = Kll(F„-i,n) ~ + A + (8.10) 

with = /o ® ® [f n-i Kn,f,n- A fn)] G C6(F[o,„-i]) and 



N 



h = 






n-1 



n 



1 ^ 



»=i Lp=o 



n— 1 



n /-Kp 



t=i Lp=o 

Conditionally on FJfLi, the N random variables 



\fn{Cn)-K,n-JfnWn-l)\ 






n/.(fp 

p=0 



(/n(a-i^n.n5'.,(/n)(^tl)] 



are independent and {Un\T^^i) = 0 for any i< N. Thus, by Lemma 7.3.3, 

we have VN E(|/i|^)p < a(p), for some finite universal constant a(p). To 
estimate / 2 , we observe that 

[-^n,T 7 ^_i ~ ^n,rjn-i]{fn) = (1 ” [^n{Vn-l) ~ ^n(^n-l)](/n) 

Using the Lp mean error estimates presented in Section 7.4, we find that 

Vn - Ar„,,„_.(/„)r)" < a(p) 6(n) 

This yields that VN E^dAI'*)' < a{p) b{n). Let J„ be defined 

Jn^y/N sup E5^(|lC(Fn)-K^,n(Fn)n" 

F„6n„ 



From previous calculations, we find that $J_n \le a(p)\,c(n) + J_{n-1}$, and the
end of the proof of the first assertion is now straightforward. To prove the
mean value estimate, we note that

h = lC-l(n-l) l*n(ll,) - *n(>),-l)l(/n) 
with Fl^_i = /o ® ® [/n-i(l “ ^n-iGn-i)). We use the decomposition 

E5^(/2) = E^((lC_i - " ^n(»?n-l))(/n)) 



+Kn_i,^(F4_l) Ei)^(($n(r?^Li) - ^niVn-lWn)) 







By the Cauchy-Schwarz inequality, we find that $|\mathbb{E}^N_{\eta_0}(I_2)| \le b(n)/N$.
Consequently, if we set

Jn = N sup |E^(Xi^(F„))-K^.„(F„)| 

F„€n„ 

by (8.10) we find that < Jn-i + c(n). This clearly ends the proof 
(8.5) for g = 1. Suppose (8.5) holds true at rank p = (g - 1). We use the 
decomposition 

9 9 9 / 9 9 

jjui-Uvi = («i-vi) (n“<“ 11’^* 

i=l »=1 »=2 \«=2 i =2 

( Q Q 

i =2 t =2 

the induction hypothesis, and the Cauchy-Schwarz inequality to prove that
for any (f5)i<i<, € II; 

iEij(n;.i •«'('?)) -n?.iK".a,(f5)i 

< |E^(lC(f’i))-Kn,,^(Fi)| -H^(l + a(g-l)) 

The end of the proof is now clear. ■ 



8.5 Relative Entropy Estimates 

In this section, we provide strong propagation estimates with respect to the
relative entropy criterion for the interacting particle system associated with
the McKean interpretation model defined in (8.1). Without further
mention, we assume the Markov transitions $M_n$ satisfy the regularity condition
(M)^^ stated on page 116, for p = 2 and some functions kn. 

The main simplification due to this condition is that the law of the
N-particle model is absolutely continuous with respect to the laws of N
independent copies of the limiting distribution model. In this context, a
natural tool for the analysis of a strong version of the propagation of chaos
for mean field interacting particle systems is the following inequality due
to Csiszar [69].

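Lemma 8.5.1 (Csiszar) Let $(E, \mathcal{E})$ be a measurable space and let $\mu^{(N)}$ be
an exchangeable measure on the product space $E^N$ such that $\mu^{(N)} \ll \eta^{\otimes N}$
for some $\eta \in \mathcal{P}(E)$. If, for $1 \le q \le N$, $\mu^{(N,q)}$ are the marginals of $\mu^{(N)}$ on the
first q-coordinates, then we have

$$\mathrm{Ent}\left(\mu^{(N,q)}\,|\,\eta^{\otimes q}\right) \le \frac{q}{N}\left(1+\frac{\{N/q\}}{[N/q]}\right)\mathrm{Ent}\left(\mu^{(N)}\,|\,\eta^{\otimes N}\right) \qquad (8.11)$$

where $[a]$ is the integer part of $a \in \mathbb{R}$ and $\{a\} = a - [a]$.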

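Proof:

The proof of (8.11) is quite simple. From the variational definition of the
relative entropy

$$\mathrm{Ent}(\mu\,|\,\eta) = \sup_{f\in C_b(E)}\left\{\mu(f) - \log\eta(\exp(f))\right\}$$

we have already $\mathrm{Ent}(\mu^{(N)}\,|\,\eta^{\otimes N}) \ge \mu^{(N)}(f^{(N)}) - \log\eta^{\otimes N}(\exp f^{(N)})$ with

$$f^{(N)}(x_1,\ldots,x_N) = \sum_{p=1}^{[N/q]}\varphi\left(x_{(p-1)q+1},\ldots,x_{pq}\right), \qquad \varphi\in C_b(E^q)$$

Since $\mu^{(N)}(f^{(N)}) = [N/q]\,\mu^{(N,q)}(\varphi)$ and $\eta^{\otimes N}(\exp f^{(N)}) = \eta^{\otimes q}(\exp\varphi)^{[N/q]}$,
taking the supremum over $\varphi \in C_b(E^q)$ one concludes that

$$\mathrm{Ent}\left(\mu^{(N)}\,|\,\eta^{\otimes N}\right) \ge [N/q]\;\mathrm{Ent}\left(\mu^{(N,q)}\,|\,\eta^{\otimes q}\right)$$

We end the proof by noting that

$$\forall a\in[1,\infty), \qquad \frac{1}{[a]} = \frac{1}{a}\;\frac{[a]+\{a\}}{[a]} = \frac{1}{a}\left(1+\frac{\{a\}}{[a]}\right)$$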
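The constant in Csiszar's inequality is simply $1/[N/q]$ written in a form that makes its size relative to $q/N$ visible. The sketch below, in exact rational arithmetic, assumes the constant is $(q/N)(1+\{N/q\}/[N/q])$ and checks that it equals $1/[N/q]$ and never exceeds $2q/N$. The function name is hypothetical.

```python
from fractions import Fraction

def csiszar_factor(N, q):
    """(q/N) * (1 + {N/q} / [N/q]) for integers 1 <= q <= N, as an exact fraction."""
    a = Fraction(N, q)
    integer_part = a.numerator // a.denominator     # [a]
    fractional_part = a - integer_part               # {a}
    return Fraction(q, N) * (1 + fractional_part / integer_part)
```

For a product measure $\mu^{\otimes N}$ the two entropies are $q\,\mathrm{Ent}(\mu|\eta)$ and $N\,\mathrm{Ent}(\mu|\eta)$, so the inequality is sharp whenever q divides N; this is a convenient sanity check of the constant.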




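Lemma 8.5.2 If $\mu$ is absolutely continuous with respect to $\eta$ and $\frac{d\mu}{d\eta} \in L_2(\eta)$,
then we have $\mathrm{Ent}(\mu\,|\,\eta) \le \left\|1 - d\mu/d\eta\right\|^2_{2,\eta}$.

Proof:

Using the standard inequality $\log u \le u - 1$, which is valid for any $u > 0$,
we clearly have

$$\mathrm{Ent}(\mu\,|\,\eta) = \int \log\left(\frac{d\mu}{d\eta}\right)d\mu \le \int\left(\frac{d\mu}{d\eta}-1\right)d\mu = \int\left(\frac{d\mu}{d\eta}-1\right)\frac{d\mu}{d\eta}\,d\eta$$

from which, since $\int\left(\frac{d\mu}{d\eta}-1\right)d\eta = 0$, one concludes that

$$\mathrm{Ent}(\mu\,|\,\eta) \le \int\left(\frac{d\mu}{d\eta}-1\right)^2 d\eta = \left\|\frac{d\mu}{d\eta}-1\right\|^2_{2,\eta}$$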
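On a finite state space the comparison between relative entropy and the squared $L_2(\eta)$ distance of the density to 1 can be checked directly. The following sketch assumes probability vectors `mu` and `eta` on the same finite set with `eta` strictly positive; the function names are illustrative.

```python
from math import log

def relative_entropy(mu, eta):
    """Ent(mu | eta) = sum_x mu(x) * log(mu(x) / eta(x)), with the convention 0 log 0 = 0."""
    return sum(m * log(m / e) for m, e in zip(mu, eta) if m > 0)

def squared_l2_distance(mu, eta):
    """|| d mu / d eta - 1 ||^2 in L_2(eta), i.e. the chi-square divergence."""
    return sum((m / e - 1.0) ** 2 * e for m, e in zip(mu, eta))
```

For example, with `mu = [0.5, 0.3, 0.2]` and `eta = [0.2, 0.3, 0.5]` the relative entropy is roughly 0.27 while the squared distance is 0.63, in line with the lemma.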



As we mentioned above, our strategy to obtain relative entropy estimates
is based on the observation that $\mathbb{P}^{(N)}_{\eta_0,n}$

is a "regular" mean field Gibbs measure. To describe the potential function,
we use the change of coordinate mapping

©n (4o, • • • ,$n)) = (C|0,n])l<«<Af 







defined in (8.2) to check that for tmy F 6 we have 

E(i^((efo.nl)l<i<A^)) 



= E(F(eiT(Co,...,U))= f,,, F{e^!{xo,...,Xr,)) 



np=‘o nf=i ° m-Hd{xo, . . . ,x„)) 

with Xp = (®p)i<»<AT e Ep and m(xp) = Therefore we find 

that 

nmUn])l<i<N) 



= E(F(0i)^(^o,...,^n))) 



/(N) ■^^(0n(®O,---,Xn)) 



with the interaction Hamiltonian function [0> oo) given for 

any ®( 0 ,n) = (®[o,n])i<<<^ ®(0.n) ^ • • • >4) by 

ni!^\x[0,n]) 



n-1 - 

/ 

p=0 •'^pX^P+1 



m(xp,xp+i)(d(«,w)) 



log 



d^p+i.m(xp)(a, 0 . . 
dFp+r,^p(u, .) 



One concludes that the distribution P^n € „j) is absolutely contin- 
uous with respect to the tensor product measure K®^ € and we 

have 



dP 



,(N) 



Vo,n 




= expffW 




To get one step further in our discussion we consider the function 



M„ : fi € F(F[o,n]) — ^ € 'P(F[o,„]) 

that associates to a given path measure fi on F[o,n] the McKean distribution 
associated with the flow fik of A:th time marginal and given by 



M„(^)(d(tio, . . . , u„)) = voiduo) X Ki,p^{uo, d«i) x . . . x (u„_i, du„) 

Note that the limiting McKean measure Kn,t/o = is a fixed point 

of M„. It is now easy to check that can be rewritten as 

= N W„(m(x[ 0 .„,)) 







with the potential function Hn on 7^(E(o,„]) defined for any fi € V{E^o,n]) 
by 



= J log^^ dfj. 



n-l - 

■ 5 /. 



dKn, 

fip,p+i{d{u,v)) ^og 

lEpXE^+i o-Op+i,,,p(«, .J 



where fip,p+i stands for the (p,p + l)-th pair-time marginal of p. This 
readily shows that is an exchangeable distribution on the product 
and path space 



Lemma 8.5.1 highlights the relations between the relative entropy and the
mean value of the potential function $H^{(N)}_n$. More precisely, according to
Lemma 8.5.1, we have that

“(rSJ!«’ICn) s 2^Enl(ll><«!,|K®«) 

= 2 iE(ff<"'(eJ'({o,. ..,{„))) (8.12) 

We end this section with some regularity properties of these potential func- 
tions. First note that Hn(^,rto) = 0 and for any fi € P(£?[o,„)) we have 

E(^n(m(Clo,„,))) 




Kp+l,m(i,){^l,dv) log 



di^p-H,m({p)(4pi •) / A 

dA:p+x.,p(c»,.) 



= [Ent •) I •))] 

p=0 

By lemma 8.5.2, we find that 
|E(Kn(m(^(o,„]))| 




Kp+i,,X^dv) 



'P+1 



| d.f^p+l,m(gp)(^pi •) 
' dKp+l,r„{^l,.) 



(v) - Ip 



<^ e (/ dr)p+i 



dVp+i 

dKp+i,r,^{ep,.) 



_ dKp+i.„,U;,.) 
drjp^i df)p+i 



') 



X 



(8.13) 







On the other hand, we have 

dKn-k-l,ixA^,>) 



= e„G„(u) A:n+i(«,w) + (1 - £nG„(u)) 



^n(^nfcn+l(*iV) 



Recalling that 



we find that 



Vn(G„k„+i(.,v)) d^„+i(rj„) 



f/n(Gn) 



drfn+1 



(v) = l 



dK„y,„„(u, _ .) ^ £„G„(«) kn+l(u,v) + (l-6nG„(u)) (> (l-£„G„(u))) 

arjn+1 



This implies that 



dK„+i,^„ (t 
“»Jn+l 



«^n+l 



= (1-£„G„(«)) - 



t7n(gnfcn4-l(»iV) 

Vn(Gn) 



and for any transition (xn,Xn+i) 



(1 - £„G„(u)) (Xn-M) < 1 



By (8.13), we readily obtain the estimate 






Since we have 



m(^p)(Gp) 



- 1 | = 



m(^p)(Gp)' 

^ l”*(Cp)(^p[^P+i(*) v) — 1])| 



< r^(l + Ikp^rliv)) H^pHfp^ - r,p(/;)l 



+ g>lW-,-)-H »<! G, = G,/V,(G,) 

we finally arrive at 

lE(K(m(C(o,„)))| 



< c E;Zo rl L,, Vp^i(dv)(l + Ikp+ilHv)) E(lm(^p)(/p - f^(/X) 







This simple estimate allows us to apply most of the convergence results
presented in Section 7.4.2 and Section 7.4.3. For instance, by Theorem 7.4.4,
we conclude that

n—l p 

p=0 q=0 



This ends the proof of the theorem. ■ 

Arguing as in the proof of (7.26), we prove the following corollary. 

Corollary 8.5.1 Suppose that T/(|fcp) =def. (1 + sup„>i »/n(l*^P)) < oo 
and conditions (G) and {M)m hold true for some m > 1 with e{G) = 
A„€„(G) > 0 and e(M) = A„e„(M) > 0. Then, we have 




^ qn T?(|fc|^) 

- N e6(M) e(G)4"» 



8.6 A Combinatorial Transport Equation 

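Throughout this section, $(E, \mathcal{E})$ denotes an arbitrary measurable space. In
the further development of this section, we fix the integer $N \ge 1$ and for
any $x = (x^1, \ldots, x^N) \in E^N$ we slightly abuse the notation and we denote
by $m(x) = \frac{1}{N}\sum_{i=1}^N \delta_{x^i} \in \mathcal{P}(E)$ the empirical measure associated with the
$N$-tuple $x$. For any $1 \le q \le N$, we introduce the empirical measures on $E^q$
defined by

$$m(x)^{\otimes q} = \frac{1}{N^q} \sum_{a \in \langle N\rangle^{\langle q\rangle}} \delta_{(x^{a(1)},\ldots,x^{a(q)})}
\quad\text{and}\quad
m(x)^{\odot q} = \frac{1}{(N)_q} \sum_{a \in \langle q,N\rangle} \delta_{(x^{a(1)},\ldots,x^{a(q)})}$$

with the sets $\langle q,N\rangle$, $\langle q\rangle$, and $\langle N\rangle$ defined on page 256. Note that each
mapping $a \in \langle N\rangle^{\langle q\rangle}$ induces a unique equivalence relation $\sim_a$ on $\langle q\rangle$ defined
for any $i, j \in \langle q\rangle$ by

$$i \sim_a j \iff a(i) = a(j)$$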

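The corresponding set of equivalence classes $\langle q\rangle_a$ can alternatively be
regarded as a partition $\pi_a$ of the set $\langle q\rangle$. More precisely, if $b(\pi_a)$ stands for
the cardinality of the set $a(\langle q\rangle)$, then we have

$$\pi_a = \{\pi_a(1), \ldots, \pi_a(b(\pi_a))\} \quad\text{with}\quad \pi_a(i) \neq \pi_a(j) \text{ for any } i \neq j$$

and

$$\langle q\rangle = \cup_{i=1}^{b(\pi_a)} \pi_a(i) \quad\text{with}\quad \pi_a(i) = \{j \in \langle q\rangle : a(j) = a(i)\}$$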







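Inversely, to each partition $\pi$ of the set $\langle q\rangle$ with $b(\pi)$ blocks, we can associate
in a unique way $(N)_{b(\pi)}$ different mappings $a \in \langle N\rangle^{\langle q\rangle}$. To be more precise,
let $\le$ be the order relation on the subsets of $\langle q\rangle$ defined for any $A, B \subset \langle q\rangle$
by

$$A \le B \iff \inf\{i : i \in A\} \le \inf\{i : i \in B\}$$

Notice that the $b(\pi)$ blocks of a partition $\pi$ of $\langle q\rangle$ can be written in the
increasing order

$$\pi_1 \le \pi_2 \le \ldots \le \pi_{b(\pi)}$$

We associate with $\pi$ and with each one-to-one mapping $\beta \in \langle b(\pi), N\rangle$ the
mapping $a_\beta^\pi \in \langle N\rangle^{\langle q\rangle}$ defined by

$$a_\beta^\pi = \sum_{i=1}^{b(\pi)} \beta(i)\, 1_{\pi_i}$$

From these one-to-one associations, we find the decomposition

$$\langle N\rangle^{\langle q\rangle} = \cup_{p=1}^{q}\ \cup_{\pi : b(\pi) = p}\ \{a_\beta^\pi : \beta \in \langle p, N\rangle\}$$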
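For small values of $q$ and $N$ the decomposition above can be checked by brute force. The following Python sketch is only an illustration (the function names are ours, not the book's): it enumerates every mapping from $\{1,\ldots,q\}$ into $\{1,\ldots,N\}$, groups the mappings by the partition they induce, and confirms that a partition with $p$ blocks is hit by exactly $(N)_p = N!/(N-p)!$ mappings.

```python
from collections import defaultdict
from itertools import product
from math import perm


def induced_partition(a):
    """Partition of {1,...,q} induced by a = (a(1),...,a(q)): i ~ j iff a(i) == a(j).

    Blocks are listed in increasing order of their smallest element.
    """
    blocks = defaultdict(list)
    for i, value in enumerate(a, start=1):
        blocks[value].append(i)
    return tuple(sorted((tuple(b) for b in blocks.values()), key=min))


def group_mappings_by_partition(q, N):
    """Group all N**q mappings from {1..q} into {1..N} by their induced partition."""
    groups = defaultdict(list)
    for a in product(range(1, N + 1), repeat=q):
        groups[induced_partition(a)].append(a)
    return groups


if __name__ == "__main__":
    q, N = 3, 4
    for partition, mappings in sorted(group_mappings_by_partition(q, N).items()):
        # a partition with p blocks corresponds to (N)_p one-to-one choices of beta
        print(partition, len(mappings), perm(N, len(partition)))
```

Grouping by the induced partition is exactly the bookkeeping behind the union over $p$ and over partitions with $p$ blocks; the enumeration is exponential in $q$ and is meant for sanity checks only.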

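In this notation, for any $x \in E^N$ and any numerical function $f$ on $E^q$, we
have that

$$\begin{aligned}
m(x)^{\otimes q}(f)
&= \frac{1}{N^q} \sum_{p=1}^{q}\ \sum_{\pi : b(\pi) = p}\ \sum_{\beta \in \langle p,N\rangle} f(x^{a_\beta^\pi(1)}, \ldots, x^{a_\beta^\pi(q)}) \\
&= \frac{1}{N^q} \sum_{p=1}^{q}\ \sum_{\pi : b(\pi) = p}\ \sum_{\beta \in \langle p,N\rangle} C_\pi^{p,q}(f)(x^{\beta(1)}, \ldots, x^{\beta(p)})
\end{aligned}$$

with the Markov kernel $C_\pi^{p,q}$ from $E^p$ into $E^q$ defined by

$$C_\pi^{p,q}(f)(x^1, \ldots, x^p) = f\Big(\sum_{i=1}^{p} x^i\, 1_{\pi_i}(1), \ldots, \sum_{i=1}^{p} x^i\, 1_{\pi_i}(q)\Big)$$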

It is now convenient to observe that for any $p \le q$ we have

ofc 

with the extended Markov kernel „ from E^ into E^ defined by 
From previous considerations, we arrive at







with the Markov transitions $C^{p,q}$, $p \le q$, on $E^q$ defined by the formula




E 

0e{q,N) 



1 

S{p,q) 



E c?.. 

ir:6(ir)=p 



In the formulae displayed above, $S(p, q)$ stands for the Stirling number of
the second kind corresponding to the number of partitions of $q$ elements
in $p$ blocks. Using the fact that $S(q, q) = 1$ and $C^{q,q} = Id$, we easily prove
the following result.

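Proposition 8.6.1 For any $x \in E^N$ and $1 \le q \le N$, we have

$$m(x)^{\otimes q} = m(x)^{\odot q}\, R_N^{(q)} \quad\text{with}\quad R_N^{(q)} = \frac{(N)_q}{N^q}\, Id + \Big(1 - \frac{(N)_q}{N^q}\Big)\, \widetilde{R}_N^{(q)}$$

and the Markov kernel $\widetilde{R}_N^{(q)}$ on $E^q$ defined by

$$\widetilde{R}_N^{(q)} = \frac{1}{N^q - (N)_q} \sum_{1 \le p < q} (N)_p\, S(p, q)\, C^{p,q}$$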
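The mixture weights in this transport equation rest on the counting identity $N^q = \sum_{p=1}^{q} (N)_p\, S(p,q)$, obtained by sorting the $N^q$ mappings according to the number of blocks of the induced partition. The short Python sketch below (helper names are ours) computes $S(p,q)$ by the usual recurrence and returns the weight $(N)_p S(p,q)/N^q$ carried by partitions with $p$ blocks; the weight at $p = q$ is the factor $(N)_q/N^q$ in front of $Id$.

```python
from fractions import Fraction
from math import perm


def S(p, q):
    """Number of partitions of q elements into p blocks (argument order as in the text)."""
    if p == q:
        return 1
    if p == 0 or p > q:
        return 0
    # the last element either joins one of the p existing blocks or forms a new block
    return p * S(p, q - 1) + S(p - 1, q - 1)


def mixture_weights(q, N):
    """Weight (N)_p S(p, q) / N^q of the partitions with p blocks, for p = 1..q."""
    return {p: Fraction(perm(N, p) * S(p, q), N ** q) for p in range(1, q + 1)}


if __name__ == "__main__":
    weights = mixture_weights(q=4, N=6)
    print(weights, sum(weights.values()))
```

Exact rational arithmetic is used so that the weights can be compared with $1$ without rounding issues.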



One easy consequence of this formula is that 






< 

< 



0-^) 



(8.14) 



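We end this discussion with a more probabilistic connection between
$m(x)^{\otimes q}$ and $m(x)^{\odot q}$. We first observe that for any $q \ge 1$ and any $f$ on
$E^{q+1}$

$$\begin{aligned}
(m(x) \otimes m(x)^{\odot q})(f)
&= \frac{1}{N (N)_q} \sum_{i=1}^{N}\ \sum_{a \in \langle q,N\rangle} f(x^i, x^{a(1)}, \ldots, x^{a(q)}) \\
&= \frac{1}{N (N)_q} \sum_{a \in \langle q+1,N\rangle} f(x^{a(1)}, \ldots, x^{a(q+1)})
 + \frac{1}{N (N)_q} \sum_{a \in \langle q,N\rangle}\ \sum_{i=1}^{q} f(x^{a(i)}, x^{a(1)}, \ldots, x^{a(q)}) \\
&= \Big(1 - \frac{q}{N}\Big)\, m(x)^{\odot (q+1)}(f) + \frac{q}{N}\, m(x)^{\odot (q+1)}(\Gamma^{(q+1)}(f))
\end{aligned}$$

with the Markov transition $\Gamma^{(q+1)}$ on $E^{q+1}$ defined by

$$\Gamma^{(q+1)}(f)(x^0, x^1, \ldots, x^q) = \frac{1}{q} \sum_{i=1}^{q} f(x^i, x^1, \ldots, x^q)$$

This readily yields that

$$m(x) \otimes m(x)^{\odot q} = m(x)^{\odot (q+1)}\, \Gamma_N^{(q+1)} \quad\text{with}\quad \Gamma_N^{(q+1)} = \Big(1 - \frac{q}{N}\Big)\, Id + \frac{q}{N}\, \Gamma^{(q+1)}$$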

The probabilistic interpretation of $\Gamma_N^{(q+1)}$ is quite elementary. Starting
from a given configuration $(x^0, x^1, \ldots, x^q) \in E^{q+1}$, the Markov transition $\Gamma_N^{(q+1)}$
consists of keeping this $(q+1)$-tuple with a probability $(1 - \frac{q}{N})$ and otherwise
replacing the first component $x^0$ by choosing randomly and uniformly
one of the other components $x^1, \ldots, x^q$. To develop an inductive construction,
we associate with a given transition $\Gamma$ on some product space $E^q$ a
transition $\mathrm{Ext}(\Gamma)$ on some product space $E^{q+1}$ by setting

$$\mathrm{Ext}(\Gamma)((x^0, x^1, \ldots, x^q), d(y^0, y^1, \ldots, y^q)) = \delta_{x^0}(dy^0)\ \Gamma((x^1, \ldots, x^q), d(y^1, \ldots, y^q))$$
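The keep-or-replace description can be tested exactly on a small configuration. The sketch below is an illustrative check under our own naming: it builds the product of the empirical measure with the uniform law on $q$-tuples with distinct indices, and compares it with the uniform law on $(q+1)$-tuples with distinct indices pushed through the transition that keeps the tuple with probability $1 - q/N$ and otherwise overwrites the first component by one of the others chosen uniformly. It assumes $q + 1 \le N$.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def distinct_index_law(x, q):
    """Uniform law on the tuples (x[a(1)],...,x[a(q)]) over one-to-one index maps a."""
    index_maps = list(permutations(range(len(x)), q))
    law = Counter()
    for a in index_maps:
        law[tuple(x[i] for i in a)] += Fraction(1, len(index_maps))
    return law


def tensor(law_a, law_b):
    """Product law, represented on concatenated tuples."""
    law = Counter()
    for u, pu in law_a.items():
        for v, pv in law_b.items():
            law[u + v] += pu * pv
    return law


def keep_or_replace(law, N):
    """Push a law on (q+1)-tuples through the keep-or-replace transition."""
    out = Counter()
    for t, pt in law.items():
        q = len(t) - 1
        out[t] += pt * (1 - Fraction(q, N))              # keep the tuple
        for i in range(1, q + 1):                        # overwrite component 0 by component i
            out[(t[i],) + t[1:]] += pt * Fraction(1, N)  # (q/N) * (1/q)
    return out


def check_identity(x, q):
    """True when both sides of the transport identity agree exactly."""
    left = tensor(distinct_index_law(x, 1), distinct_index_law(x, q))
    right = keep_or_replace(distinct_index_law(x, q + 1), len(x))
    return dict(left) == dict(right)
```

For instance, `check_identity(("a", "b", "c", "d"), 2)` compares the two laws on triples built from four points.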

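In this somewhat abusive notation, we have for instance

$$\begin{aligned}
m(x)^{\otimes 2} &= m(x)^{\odot 2}\, \Gamma_N^{(2)} \\
m(x)^{\otimes 3} &= m(x) \otimes m(x)^{\otimes 2} \\
&= m(x) \otimes (m(x)^{\odot 2}\, \Gamma_N^{(2)}) \\
&= (m(x) \otimes m(x)^{\odot 2})\, \mathrm{Ext}(\Gamma_N^{(2)}) = m(x)^{\odot 3}\, \Gamma_N^{(3)}\, \mathrm{Ext}(\Gamma_N^{(2)})
\end{aligned}$$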

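More generally, if we define using backward induction

$$\mathcal{R}_N^{(q+1)} = \Gamma_N^{(q+1)}\, \mathrm{Ext}(\mathcal{R}_N^{(q)}) \quad\text{with}\quad \mathcal{R}_N^{(1)} = Id$$

then we conclude that $m(x)^{\otimes q} = m(x)^{\odot q}\, \mathcal{R}_N^{(q)}$. To describe more precisely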
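the Markov transition $\mathcal{R}_N^{(q)}$, we introduce a sequence $(\epsilon_i^{(q)})_{1 \le i \le q}$
of $q$ independent and $\{0,1\}$-valued random variables with respective
distributions

$$P(\epsilon_i^{(q)} = 0) = 1 - P(\epsilon_i^{(q)} = 1) = \frac{q - i}{N}$$

Notice that for $i = q$ we have $\epsilon_q^{(q)} = 1$. We also associate with a given
configuration $(x^1, \ldots, x^q) \in E^q$ a collection $(\tilde{x}^{(q,i)})_{1 \le i \le q}$ of independent (and independent
of $(\epsilon_i^{(q)})_{1 \le i \le q}$) random variables with respective distributions

$$P(\tilde{x}^{(q,i)} \in dy) = \frac{1}{q - i} \sum_{j=i+1}^{q} \delta_{x^j}(dy)$$

with the convention $\tilde{x}^{(q,q)} = x^q$. From
the inductive construction of $\mathcal{R}_N^{(q)}$, we observe that the $E^q$-valued random
variable

$$X^{(q)} = (x^{(q,1)}, \ldots, x^{(q,q)}) \quad\text{with}\quad x^{(q,i)} = \epsilon_i^{(q)}\, x^i + (1 - \epsilon_i^{(q)})\, \tilde{x}^{(q,i)}$$

is distributed according to $\mathcal{R}_N^{(q)}((x^1, \ldots, x^q), \cdot)$.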
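The construction above lends itself to an exact check on a small example. In the sketch below (an illustration with hypothetical helper names), component $i$ of a configuration is kept with probability $1 - (q-i)/N$ and is otherwise replaced by one of the later components, each with probability $1/N$, independently across components. Averaging the resulting law over all $q$-tuples with distinct indices should give back the full product of the empirical measure.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product


def transported_law(config, N):
    """Exact law of the random tuple built from config = (x^1,...,x^q)."""
    q = len(config)
    per_component = []
    for i in range(q):
        later = config[i + 1:]                      # components x^{i+1},...,x^q
        choices = Counter({config[i]: 1 - Fraction(len(later), N)})
        for y in later:
            choices[y] += Fraction(1, N)            # (q-i)/N times 1/(q-i)
        per_component.append(choices)
    law = Counter()
    for combo in product(*(c.items() for c in per_component)):
        prob = Fraction(1)
        for _, p in combo:
            prob *= p
        law[tuple(v for v, _ in combo)] += prob
    return law


def average_over_distinct_tuples(x, q):
    """Average of transported_law over the uniform law on tuples with distinct indices."""
    configs = list(permutations(x, q))
    law = Counter()
    for config in configs:
        for values, prob in transported_law(config, len(x)).items():
            law[values] += prob / len(configs)
    return law


def full_product_law(x, q):
    """q-fold product of the empirical measure of x."""
    law = Counter()
    for values in product(x, repeat=q):
        law[values] += Fraction(1, len(x) ** q)
    return law
```

Comparing `average_over_distinct_tuples(x, q)` with `full_product_law(x, q)` for a few small `x` gives a direct numerical confirmation of the transport equation.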



8.7 Asymptotic Properties of Boltzmann-Gibbs 
Distributions 

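Let $\mu$ be a probability measure on a given measurable state space $(E, \mathcal{E})$.
During the further development of this section, we fix an integer $N \ge 1$
and we denote by $m(X) = \frac{1}{N}\sum_{i=1}^{N} \delta_{X^i}$ the $N$-empirical measure associated
with a collection of independent and identically distributed random
variables $X = (X^i)_{i \ge 1}$, with common law $\mu$. We denote by $m(X)^{\otimes q}$ and
$m(X)^{\odot q}$, $q \le N$, the random distributions on $E^q$ defined by

$$m(X)^{\otimes q} = \frac{1}{N^q} \sum_{a \in \langle N\rangle^{\langle q\rangle}} \delta_{(X^{a(1)}, \ldots, X^{a(q)})}
\quad\text{and}\quad
m(X)^{\odot q} = \frac{1}{(N)_q} \sum_{a \in \langle q,N\rangle} \delta_{(X^{a(1)}, \ldots, X^{a(q)})}$$

(with the sets $\langle q,N\rangle$, $\langle q\rangle$, and $\langle N\rangle$ defined on page 256). Let $g = (g_i)_{i \ge 1}$
be a collection of $\mathcal{E}$-measurable and nonnegative functions on $E$ such that
$\mu(g_i) \in (0, \infty)$ for each $i \ge 1$. For any fixed integer $q \ge 1$, we denote by
$g^{(q)}$ the $q$-tensor product function on $E^q$ defined by

$$g^{(q)} = g_1 \otimes \cdots \otimes g_q : (x^1, \ldots, x^q) \in E^q \longrightarrow g_1(x^1) \cdots g_q(x^q) \in (0, \infty)$$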

We note that 

m(X)®’(g(«)) = nm(X)(g,) and = f[m{X){gi) 

i=l t=l 
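The product formula for the integral of a tensor product function against the $q$-fold product of the empirical measure can be confirmed by direct enumeration. The following sketch uses integer-valued test functions (chosen arbitrarily for the illustration) so that the comparison is exact.

```python
from fractions import Fraction
from itertools import product
from math import prod


def empirical_mean(sample, g):
    """m(X)(g) for the empirical measure of the sample."""
    return Fraction(sum(g(x) for x in sample), len(sample))


def tensor_empirical_mean(sample, gs):
    """Integral of g_1(x^1)...g_q(x^q) against the q-fold product of the
    empirical measure, computed by enumerating all N^q index maps."""
    N, q = len(sample), len(gs)
    total = sum(
        prod(g(sample[i]) for g, i in zip(gs, idx))
        for idx in product(range(N), repeat=q)
    )
    return Fraction(total, N ** q)


if __name__ == "__main__":
    sample = [1, 2, 4, 7]
    gs = [lambda x: x, lambda x: x * x + 1, lambda x: 3]
    print(tensor_empirical_mean(sample, gs), prod(empirical_mean(sample, g) for g in gs))
```

Note that the same factorization does not hold for the measure built on one-to-one index maps, which is why the transport equation of Section 8.6 is needed to pass from one to the other.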

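It is also convenient to introduce the mapping $e_{\mu,g}$ from $(0, \infty)$ into itself
and defined by

$$e_{\mu,g}(u) = \mathrm{osc}_\mu^2(g)\,(1 + \mathrm{osc}_\mu(g)\, \sqrt{u})\, \exp(\mathrm{osc}_\mu^2(g)\, u)$$

with $\mathrm{osc}_\mu(g) = \sup_{i \ge 1} \mathrm{osc}(g_i/\mu(g_i))$. When the potential functions $g$ are
chosen such that $\mu(g_i) = 1$ for any $i \ge 1$, we simplify the notation and we
write $e_g$ instead of $e_{\mu,g}$ to emphasize that the function does not depend on
$\mu$.

We associate with the pair $(g, q)$ the Boltzmann-Gibbs transformation
$\Psi^{(q)} : \mathcal{P}(E^q) \to \mathcal{P}(E^q)$ defined for any $(\eta, f) \in \mathcal{P}(E^q) \times \mathcal{B}_b(E^q)$ by the
formula

$$\Psi^{(q)}(\eta)(f) = \eta(g^{(q)} f)/\eta(g^{(q)})$$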






The main object of this section is to analyze the asymptotic properties of
the random distributions $\Psi^{(q)}(m(X)^{\otimes q})$ as the pair parameter $(q, N)$ tends
to infinity. Our main result is the following theorem.

Theorem 8.7.1 Let $(g_i)_{i \ge 1}$ be a collection of measurable functions $g_i$ with
uniformly bounded oscillations $\mathrm{osc}(g) = \sup_{i \ge 1} \mathrm{osc}(g_i) < \infty$. For any
$N \ge q \ge 1$ and $f \in \mathcal{B}_b(E^q)$ with $\mathrm{osc}(f) \le 1$, we have

|E(®“(m(X)®')(/)) - [l + V. (^)] (815) 

and for any n > 1 

E(('i'(«)(m(X)®«)(/) - 4 '(«)(/i®«)(/)]2’») 

<c2*-^ [l + ep,p(^)] (8.16) 

Theorem 8.7.1 will be proved at the end of the section. In order to prepare 
for its proof, we first present a technical lemma of separate interest. 
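To get a feeling for the bias estimate of the theorem in the simplest case $q = 1$, one can compute the bias of the self-normalized quantity exactly on a two-point state space. The sketch below is a toy illustration under our own assumptions: $\mu$ is uniform on $\{0, 1\}$, the potential takes the values $g(0) = 1$ and $g(1) = 3$, and $f$ is the indicator of $\{1\}$. The number of sample points equal to $1$ is then binomial, so the expectation is a finite sum.

```python
from fractions import Fraction
from math import comb


def exact_bias(N, g=(1, 3)):
    """Exact value of E[Psi(m(X))(f)] - Psi(mu)(f) on the state space {0, 1},
    with mu uniform, potential g = (g(0), g(1)) and f the indicator of {1}."""
    g0, g1 = g
    target = Fraction(g1, g0 + g1)                    # Psi(mu)(f)
    mean = Fraction(0)
    for k in range(N + 1):                            # k = number of sample points equal to 1
        weight = Fraction(comb(N, k), 2 ** N)         # binomial(N, 1/2) probability
        mean += weight * Fraction(g1 * k, g0 * (N - k) + g1 * k)
    return mean - target


if __name__ == "__main__":
    for N in (1, 2, 8, 64):
        bias = exact_bias(N)
        print(N, float(bias), float(N * bias))
```

In this example the product of $N$ and the bias stays bounded as $N$ grows, which is the behavior expected from an estimate of order $1/N$ when $q = 1$; the example says nothing about the dependence on $q$.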

Lemma 8.7.1 Let $(g_i)_{i \ge 1}$ be a collection of measurable functions $g_i$ with
uniformly bounded oscillations $\mathrm{osc}(g) = \sup_{i \ge 1} \mathrm{osc}(g_i) < \infty$ and such that
$\mu(g_i) = 1$ for any $i \ge 1$. Then, for any $n \ge 1$ we have

I E([m(X)®’(s(»)) - ID I < 2'*-* ^ e, (8.17) 

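Proof:

We first prove (8.17) for $n = 1$. Using the decomposition

$$\prod_{i=1}^{q} (1 + a_i) = 1 + \sum_{1 \le p \le q}\ \sum_{1 \le i_1 < \cdots < i_p \le q} a_{i_1} \cdots a_{i_p}$$

which is valid for any $q \ge 0$ and any collection of real numbers $(a_i)_{i \ge 1}$, we
find that

$$\mathbb{E}\Big(\prod_{i=1}^{q} m(X)(g_i)\Big) - 1 = \sum_{2 \le p \le q}\ \sum_{1 \le i_1 < \cdots < i_p \le q} \mathbb{E}\Big(\prod_{k=1}^{p} [m(X)(g_{i_k}) - 1]\Big)$$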
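The product expansion used at the start of the proof is elementary, and a quick numerical check makes the indexing explicit. The sketch below (function name ours) sums the products of the $a_i$ over all nonempty index subsets, organized by subset size $p$, and compares the result with the product of the factors $1 + a_i$.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def expand_product(a):
    """Return 1 + sum over p = 1..q and over index subsets i_1 < ... < i_p
    of the products a_{i_1} ... a_{i_p}."""
    q = len(a)
    return 1 + sum(
        prod(a[i] for i in subset)
        for p in range(1, q + 1)
        for subset in combinations(range(q), p)
    )


if __name__ == "__main__":
    a = [Fraction(1, 2), Fraction(-1, 3), Fraction(2)]
    print(expand_product(a), prod(1 + x for x in a))
```

In the proof the terms with $p = 1$ drop out after taking expectations because each factor is centered, which is why the sum there starts at $p = 2$.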

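Using Hölder's inequality, we find that

$$\Big|\mathbb{E}\Big(\prod_{i=1}^{q} m(X)(g_i)\Big) - 1\Big| \le \sum_{2 \le p \le q} \binom{q}{p}\ \mathbb{E}(|m(X)(g) - 1|^p)$$

with

$$\mathbb{E}(|m(X)(g) - 1|^p) = \sup_{i \ge 1} \mathbb{E}(|m(X)(g_i) - 1|^p)$$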







Suppose $q = 2q'$ is an even integer. In this case, using the first part of
Lemma 7.3.3, we find that

|E(nm(X)(ft))-l| < (2p), 

, ^ p2p+i (2p + l)p+i / 

h ^ V j 



In the display above, we have used the notation $\mathrm{osc}(g) = \sup_{i \ge 1} \mathrm{osc}(g_i)$.
Since we have the estimates



Cll (2p)p 

(2p+l)p+i 



1 {2q')\ ( 2 (? 02 p ^ (2gQ^P 

p! (2q' - 2p)! p! ~ p! p! 

1 (2gQ! _ (2g^)2p+i ^ 

p! (2g' - (2p+ 1))! p! “ p! 



this also yields that 

|E(n^i»nW(Pi))-l| 



1/^2 \ P ? 1 / />2 \ P + 1/2 

P=1 ^ ' P=1 ' 



< (1 + osc(p) q/y/W) 



9/2 



E 






P 



Recalling that ^ ^ f Sp=o ^ ^ n>0 and e > 0, we 

arrive at 



E^nmCXKft))- 



\t=i 




with $e_g(u) = \mathrm{osc}^2(g)\,(1 + \mathrm{osc}(g)\, \sqrt{u})\, \exp(\mathrm{osc}^2(g)\, u)$. The proof for odd
integers $q = 2q' + 1$ is derived in a completely analogous fashion. This ends
the proof of (8.17) when $n = 1$. Next we prove (8.17) for even integers
$n = 2n'$, $n' \in \mathbb{N}$. We use the decomposition



2n' 

E((m(X)®’(p(’))-l]2"') = (-l)'’E([m(X)®’(p(’))n 

p=0 

= /l + ^2 + *^3 







with 

n' 

p=0 

p=0 

h = Ecg.-Ec£'‘=<> 

p=0 p=0 

Next we observe that for any n > 1 we have 



with 



= m(X)®« ® ® m(X)®« and ® . . . (g> </' 



(«) 



n times 



n times 



From previous considerations, we find that

n' 

p=i 

. (2fx?)^ / (2qp)^ \ ^ /(ng)^\^ 2p 

- Z^^2n' 27V 2N /- 2iV 2AT 2n' 

P=1 p=l 



Using similar arguments, we find that 



1/2I < 



jnq? 

2N 






from which we conclude that 

|E([m(X)®«((/(«)) - 1]")| < 2 

The proof of this estimate for odd integers n = 2n' + 1 follows the same 
arguments. This completes the proof of the lemma. ■ 

The proof of Theorem 8.7.1 will be easily established as follows. 








Proposition 8.7.1 Let $(g_i)_{i \ge 1}$ be a collection of measurable functions $g_i$
with uniformly bounded oscillations $\mathrm{osc}(g) = \sup_{i \ge 1} \mathrm{osc}(g_i) < \infty$ and such
that $\mu(g_i) = 1$ for any $i \ge 1$. For any $n \ge 1$, $N \ge q \ge 1$, and $f \in \mathcal{B}_b(E^q)$
with $\|f\| \le 1$ and $\mathrm{osc}(f) \le 1$, we have







8 ) 



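Proof:

From Proposition 8.6.1, we have the Markovian transport equation

$$m(X)^{\otimes q} = m(X)^{\odot q}\, R_N^{(q)} \quad\text{with}\quad R_N^{(q)} = \frac{(N)_q}{N^q}\, Id + \Big(1 - \frac{(N)_q}{N^q}\Big)\, \widetilde{R}_N^{(q)}$$

for some Markov kernel $\widetilde{R}_N^{(q)}$ on $E^q$ and for any $q \le N$. Since

$$(R_N^{(q)} - Id) = (1 - (N)_q/N^q)\ (\widetilde{R}_N^{(q)} - Id)$$

and recalling that $\mathbb{E}(m(X)^{\odot q}(g^{(q)} f)) = \mu^{\otimes q}(g^{(q)} f)$, we readily prove that

$$\mathbb{E}(m(X)^{\otimes q}(g^{(q)} f)) - \mu^{\otimes q}(g^{(q)} f) = (1 - (N)_q/N^q)\ \mathbb{E}(m(X)^{\odot q}[\widetilde{R}_N^{(q)} - Id](g^{(q)} f))$$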



To estimate the r.h.s. term in the display above, we use the decomposition 

/i®«[fly - /d](5<’V) = Ii + h 



with 



h = M 















We observe that 

From these estimates, we find that 

|E(m(X)®«(5(«)/)-/x®«(</('')/)| 



< (1 - {N\/NO) [1 + 2i/i®''[fly - 
= (1 - {N)JN0) + 2 |/i®’[fly - 




Consequently we have

$$|\mathbb{E}(m(X)^{\otimes q}(g^{(q)} f)) - \mu^{\otimes q}(g^{(q)} f)| \le (1 - (N)_q/N^q) + 2\, |\mathbb{E}(m(X)^{\otimes q}(g^{(q)})) - 1|$$

and by Lemma 8.7.1, this implies that



|E(m(X)®’b“/))-,i®«(9<’>/)l < (1 - + ^ «» (^) 

Using the same line of reasoning as at the end of the proof of Lemma 8.7.1,
we also prove that for any $n \ge 1$

|E([m(X)®’(ff(«>/) - /i®’(</(’)/))")l < 2"+' ^ 1 + e, (^)] 

This ends the proof of the proposition. ■ 



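Proof of Theorem 8.7.1:

By the definition of $\Psi^{(q)}$, no generality is lost and much convenience is
gained by supposing (as will be done) that we have $\mu(g_i) = 1$ for each
$i \ge 1$. To prove (8.16), we use the decomposition

$$\Psi^{(q)}(m(X)^{\otimes q})(f) - \Psi^{(q)}(\mu^{\otimes q})(f) = \Psi^{(q)}(m(X)^{\otimes q})(f - \mu^{\otimes q}(g^{(q)} f)) = I_1 + I_2 \qquad (8.19)$$

with $I_1 = m(X)^{\otimes q}(g^{(q)}(f - \mu^{\otimes q}(g^{(q)} f)))$ and

$$I_2 = \Psi^{(q)}(m(X)^{\otimes q})(f - \mu^{\otimes q}(g^{(q)} f))\ (1 - m(X)^{\otimes q}(g^{(q)}))$$

It is now convenient to observe that

$$\mu^{\otimes q}(g^{(q)}(f - \mu^{\otimes q}(g^{(q)} f))) = 0$$

$$\|f - \mu^{\otimes q}(g^{(q)} f)\| \le \mathrm{osc}(f) = \mathrm{osc}(f - \mu^{\otimes q}(g^{(q)} f)) \le 1$$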

and for any n > 1 we have 

E([4'<’)(m(X)®’)(/) - >i'(«)(/i®«)(/)]2”) < (E(/2") +E(/|")) 

Therefore, using Proposition 8.7.1 and Lemma 8.7.1, we check that 

E([^(«)(m(X)®’)(/)-^>('')(^®«)(/)]2”) < c 2^” ^ [l + e, (^)] 







This ends the proof of (8.16). To prove (8.15), we use again the decomposition
(8.19). By (8.18), we find that

|E(A)l<c^ Il + e,(^)| 

To estimate the mean value of $I_2$, we first use the Cauchy-Schwarz
inequality to check that

$$|\mathbb{E}(I_2)|^2 \le \mathbb{E}([\Psi^{(q)}(m(X)^{\otimes q})(f - \mu^{\otimes q}(g^{(q)} f))]^2)\ \mathbb{E}([1 - m(X)^{\otimes q}(g^{(q)})]^2)$$

Via (8.17) and (8.16), this implies that

|E(«l<c^ |l+e,(^)| 



from which we conclude that



|E(^(<')(m(X)®«)(/)) - ^(’)(/x®«)(/)| < c ^ 




This ends the proof of the theorem. 



8.8 Feynman-Kac Semigroups 

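To analyze propagation-of-chaos properties in path space, it is convenient
to consider the Feynman-Kac tensor product distributions on path space

$$\mathbb{K}_n = (\eta_0 \otimes \cdots \otimes \eta_n) \in \mathcal{P}(E_{[0,n]})$$

By the definition of $\Phi_{p,n}$, we have $\mathbb{K}_n = \Omega_{p,n}(\mathbb{K}_p)$, for any $p \le n$, with
the (nonlinear) semigroup $\Omega_{p,n} : \mathcal{P}(E_{[0,p]}) \to \mathcal{P}(E_{[0,n]})$ defined for any
$\mu \in \mathcal{P}(E_{[0,p]})$ by

$$\Omega_{p,n}(\mu) = \mu \otimes \Phi_{p,p+1}(\mu_p) \otimes \Phi_{p,p+2}(\mu_p) \otimes \cdots \otimes \Phi_{p,n}(\mu_p) \qquad (8.20)$$

In the display above, $\mu_p \in \mathcal{P}(E_p)$ stands for the $p$th time marginal of $\mu$
defined for any $\varphi_p \in \mathcal{B}_b(E_p)$ by

$$\mu_p(\varphi_p) = \mu(\underbrace{1 \otimes \cdots \otimes 1}_{p \text{ times}} \otimes\, \varphi_p)$$

Again we use the convention $\Omega_{n,n} = Id$ for $p = n$. To check that $\Omega_{p,n}$ is a
well-defined semigroup, we observe that for any $\mu \in \mathcal{P}(E_{[0,p]})$ we have

$$\Omega_{p,p+1}(\mu) = \mu \otimes \Phi_{p,p+1}(\mu_p)$$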






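It follows that

$$\begin{aligned}
\Omega_{p+1,n}(\Omega_{p,p+1}(\mu))
&= \Omega_{p+1,n}(\mu \otimes \Phi_{p,p+1}(\mu_p)) \\
&= \mu \otimes \Phi_{p,p+1}(\mu_p) \otimes \Phi_{p+1,p+2}(\Phi_{p,p+1}(\mu_p)) \otimes \cdots \otimes \Phi_{p+1,n}(\Phi_{p,p+1}(\mu_p)) \\
&= \mu \otimes \Phi_{p,p+1}(\mu_p) \otimes \Phi_{p,p+2}(\mu_p) \otimes \cdots \otimes \Phi_{p,n}(\mu_p) = \Omega_{p,n}(\mu)
\end{aligned}$$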

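In the forthcoming development of this section, we fix a positive integer
$q \ge 1$ and we denote by $\mathbb{K}_n^{(q)}$ the $q$-tensor product Feynman-Kac measures
defined by

$$\mathbb{K}_n^{(q)} = (\eta_0^{\otimes q} \otimes \cdots \otimes \eta_n^{\otimes q}) \in \mathcal{P}(E_{[0,n]}^{(q)}) \quad\text{with}\quad E_{[0,n]}^{(q)} = E_0^q \times \cdots \times E_n^q$$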

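Notice that $\mathbb{K}_n^{(q)} = \mathbb{K}_n^{\otimes q} \circ (\Theta_n^q)^{-1}$ with the change of coordinate mapping
$\Theta_n^q : (E_{[0,n]})^q \to E_{[0,n]}^{(q)}$ defined in (8.2). The next two subsections are devoted
respectively to the study of the dynamical structure of the tensor product
distributions $\eta_n^{\otimes q}$ and $\mathbb{K}_n^{(q)}$.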

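8.8.1 Marginal Models

We observe that $\eta_n^{\otimes q}$ can alternatively be defined for any $f \in \mathcal{B}_b(E_n^q)$ by
the Feynman-Kac formulae

$$\eta_n^{\otimes q}(f) = \gamma_n^{\otimes q}(f)/\gamma_n^{\otimes q}(1) \quad\text{with}\quad \gamma_n^{\otimes q}(f) = \mathbb{E}_{\eta_0}^{(q)}\Big(f(X_n^{(q)}) \prod_{p=0}^{n-1} G_p^{(q)}(X_p^{(q)})\Big)$$

where

• $\mathbb{E}_{\eta_0}^{(q)}(\cdot)$ represents the integration with respect to the law of
$q$ independent copies $X_n^{(q)} = (X_n^{(q,1)}, X_n^{(q,2)}, \ldots, X_n^{(q,q)}) \in E_n^q$ of a
Markov chain with initial distribution $\eta_0 \in \mathcal{P}(E_0)$ and Markov transitions
$M_n$. In other words, $X_n^{(q)}$ is a nonhomogeneous and $E_n^q$-valued
Markov chain with transitions

$$M_n^{(q)}((x_{n-1}^1, \ldots, x_{n-1}^q), d(x_n^1, \ldots, x_n^q)) = \prod_{i=1}^{q} M_n(x_{n-1}^i, dx_n^i)$$

• $G_n^{(q)} : E_n^q \to (0, \infty)$, $n \ge 0$, is the sequence of $q$-tensor product
potential functions defined for any $(x_n^1, \ldots, x_n^q) \in E_n^q$ by

$$G_n^{(q)}(x_n^1, \ldots, x_n^q) = \prod_{i=1}^{q} G_n(x_n^i)$$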
i=l 







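This rather simple representation indicates that the sequence of distribution
flows $\eta_n^{\otimes q}$ and $\gamma_n^{\otimes q}$, $q \ge 1$, has exactly the same semigroup structure.
Let $Q_{n+1}^{(q)}$ and $\Phi_{n+1}^{(q)}$ be respectively the bounded integral operator from
$\mathcal{B}_b(E_{n+1}^q)$ into $\mathcal{B}_b(E_n^q)$ and the mapping from $\mathcal{P}(E_n^q)$ into $\mathcal{P}(E_{n+1}^q)$ defined for any
$(\eta, f) \in \mathcal{P}(E_n^q) \times \mathcal{B}_b(E_{n+1}^q)$ by

$$Q_{n+1}^{(q)}(f) = G_n^{(q)}\, M_{n+1}^{(q)}(f) \quad\text{and}\quad \Phi_{n+1}^{(q)}(\eta) = \Psi_n^{(q)}(\eta)\, M_{n+1}^{(q)}$$

with the Boltzmann-Gibbs transformations $\Psi_n^{(q)}$ on $\mathcal{P}(E_n^q)$ given by

$$\Psi_n^{(q)}(\eta)(dx_n) = \frac{1}{\eta(G_n^{(q)})}\ G_n^{(q)}(x_n)\ \eta(dx_n)$$

By the Markov property and the multiplicative form of the Feynman-Kac
models, we prove that the distribution flows $\gamma_n^{\otimes q}$ and $\eta_n^{\otimes q}$ satisfy the recursions

$$\gamma_{n+1}^{\otimes q} = \gamma_n^{\otimes q}\, Q_{n+1}^{(q)} \quad\text{and}\quad \eta_{n+1}^{\otimes q} = \Phi_{n+1}^{(q)}(\eta_n^{\otimes q})$$

We let $Q_{p,n}^{(q)}$ and $\Phi_{p,n}^{(q)}$, $p \le n$, be the semigroups associated respectively
with $\gamma_n^{\otimes q}$ and $\eta_n^{\otimes q}$. That is, we have that

$$\gamma_n^{\otimes q} = \gamma_p^{\otimes q}\, Q_{p,n}^{(q)} \quad\text{and}\quad \eta_n^{\otimes q} = \Phi_{p,n}^{(q)}(\eta_p^{\otimes q})$$

As usual, we use the convention $Q_{n,n}^{(q)} = Id$ and $\Phi_{n,n}^{(q)} = Id$ for $p = n$.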

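Our final objective is to provide a Boltzmann-Gibbs representation of
the semigroup $\Phi_{p,n}^{(q)}$. To this end, we let $G_{p,n}^{(q)} : E_p^q \to (0, \infty)$ and $P_{p,n}^{(q)}$ be
respectively the potential function and the Markov transition from $E_p^q$ into
$E_n^q$ defined for any $f_n \in \mathcal{B}_b(E_n^q)$ by the formulae

$$G_{p,n}^{(q)} = Q_{p,n}^{(q)}(1) \quad\text{and}\quad P_{p,n}^{(q)}(f_n) = Q_{p,n}^{(q)}(f_n)/Q_{p,n}^{(q)}(1)$$

If we set $G_{p,n} = Q_{p,n}(1)$, then we find that for any $(x_p^1, \ldots, x_p^q) \in E_p^q$

$$\begin{aligned}
G_{p,n}^{(q)}(x_p^1, \ldots, x_p^q) &= Q_{p,n}(1)(x_p^1) \cdots Q_{p,n}(1)(x_p^q) \\
&= G_{p,n}(x_p^1) \cdots G_{p,n}(x_p^q)
\end{aligned}$$

From previous considerations, we readily see that for any $\mu \in \mathcal{P}(E_p^q)$ we
have

$$\Phi_{p,n}^{(q)}(\mu) = \Psi_{p,n}^{(q)}(\mu)\, P_{p,n}^{(q)} \qquad (8.21)$$

with the Boltzmann-Gibbs transformations $\Psi_{p,n}^{(q)}$ on $\mathcal{P}(E_p^q)$ associated with
the potential function $G_{p,n}^{(q)}$ and defined for any $(\mu, f) \in \mathcal{P}(E_p^q) \times \mathcal{B}_b(E_p^q)$
by $\Psi_{p,n}^{(q)}(\mu)(f) = \mu(G_{p,n}^{(q)} f)/\mu(G_{p,n}^{(q)})$.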
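On a finite state space the Boltzmann-Gibbs representation of the semigroup reduces to a few matrix-vector products and can be verified exactly. The sketch below is a toy illustration for $q = 1$ with a time-homogeneous potential `G` and transition matrix `M` of our own choosing; it compares the flow obtained by iterating the one-step updating-prediction map with the one-shot formula using the potential $Q_{p,n}(1)$ and the normalized kernel $Q_{p,n}(f)/Q_{p,n}(1)$.

```python
from fractions import Fraction as F


def normalize(v):
    total = sum(v)
    return [x / total for x in v]


def apply_Q(G, M, f):
    """Q(f)(x) = G(x) * sum_y M(x, y) f(y)."""
    return [G[i] * sum(M[i][j] * f[j] for j in range(len(f))) for i in range(len(G))]


def iterate_Q(G, M, f, steps):
    """Apply Q repeatedly: the time-homogeneous analogue of Q_{p,n}(f) with n - p = steps."""
    for _ in range(steps):
        f = apply_Q(G, M, f)
    return f


def flow(mu, G, M, steps):
    """Iterate the one-step map eta -> Psi_G(eta) M (selection by G, then mutation by M)."""
    eta, d = list(mu), len(mu)
    for _ in range(steps):
        updated = normalize([eta[i] * G[i] for i in range(d)])
        eta = [sum(updated[i] * M[i][j] for i in range(d)) for j in range(d)]
    return eta


def boltzmann_gibbs_form(mu, G, M, steps):
    """One-shot formula: reweight mu by G_{p,n} = Q_{p,n}(1), then apply P_{p,n}."""
    d = len(mu)
    G_pn = iterate_Q(G, M, [F(1)] * d, steps)
    weights = normalize([mu[i] * G_pn[i] for i in range(d)])
    law = []
    for j in range(d):
        indicator = [F(int(k == j)) for k in range(d)]
        Qf = iterate_Q(G, M, indicator, steps)
        law.append(sum(weights[i] * Qf[i] / G_pn[i] for i in range(d)))
    return law


# hypothetical three-state example
MU = [F(1, 2), F(1, 3), F(1, 6)]
POTENTIAL = [F(1), F(2), F(1, 2)]
KERNEL = [[F(1, 2), F(1, 2), F(0)], [F(1, 4), F(1, 2), F(1, 4)], [F(0), F(1, 3), F(2, 3)]]
```

Calling `flow(MU, POTENTIAL, KERNEL, 3)` and `boltzmann_gibbs_form(MU, POTENTIAL, KERNEL, 3)` returns the same probability vector. The $q$-tensor case has the same structure, with products of potentials and product kernels in place of `G` and `M`.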







8.8.2 Path-Space Models 

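To describe the dynamical structure of the semigroups $\Omega_{p,n}$ introduced in
(8.20), we first observe that for $\eta \in \mathcal{P}(E_p)$ and $F \in \mathcal{B}_b(E_{(p,n]})$ we have

$$\begin{aligned}
(\Phi_{p,p+1}(\eta) \otimes \Phi_{p,p+2}(\eta) \otimes \cdots \otimes \Phi_{p,n}(\eta))(F)
&= \Big[\prod_{k=1}^{n-p} \frac{1}{\eta Q_{p,p+k}(1)}\Big]\ (\eta Q_{p,p+1} \otimes \cdots \otimes \eta Q_{p,n})(F) \\
&= \frac{\eta^{\otimes (n-p)}(T_{p,n}(F))}{\eta^{\otimes (n-p)}(T_{p,n}(1))}
\end{aligned}$$

with the bounded integral operator $T_{p,n}$ from $\mathcal{B}_b(E_{(p,n]})$ into $\mathcal{B}_b(E_p^{(n-p)})$ defined for
any $(x_p^1, \ldots, x_p^{n-p}) \in E_p^{(n-p)}$ by

$$T_{p,n}(F)(x_p^1, \ldots, x_p^{n-p}) = \int_{E_{(p,n]}} \Big[\prod_{k=1}^{n-p} Q_{p,p+k}(x_p^k, dx_{p+k})\Big]\ F(x_{p+1}, \ldots, x_n)$$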

Also observe that the mapping $T_{p,n}(1)$ coincides with the $(n - p)$-tensor
product potential function

$$T_{p,n}(1)(x^1, \ldots, x^{(n-p)}) = \prod_{k=1}^{n-p} Q_{p,p+k}(1)(x^k)$$

In other words, in terms of the potential functions $G_{p,n} = Q_{p,n}(1)$, we have
that

$$T_{p,n}(1) = G_{p,p+1} \otimes G_{p,p+2} \otimes \cdots \otimes G_{p,n} \qquad (8.22)$$

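In this notation, (8.20) can be rewritten for any $\mu \in \mathcal{P}(E_{[0,p]})$ as follows

$$\Omega_{p,n}(\mu)(\cdot) = \mu \otimes \frac{\mu_p^{\otimes (n-p)}\, T_{p,n}(\cdot)}{\mu_p^{\otimes (n-p)}\, T_{p,n}(1)} = \mu \otimes \big(B_{p,n}[\mu_p^{\otimes (n-p)}]\ U_{p,n}\big)$$

with

• the $p$th time marginal distribution $\mu_p \in \mathcal{P}(E_p)$ of $\mu \in \mathcal{P}(E_{[0,p]})$

• the Boltzmann-Gibbs transformation $B_{p,n}$ on $\mathcal{P}(E_p^{(n-p)})$ and the
Markov transition $U_{p,n}$ from $E_p^{(n-p)}$ into $E_{(p,n]}$ defined for any pair
$(\nu, f) \in (\mathcal{P}(E_p^{(n-p)}) \times \mathcal{B}_b(E_p^{(n-p)}))$ and $F \in \mathcal{B}_b(E_{(p,n]})$ by

$$B_{p,n}(\nu)(f) = \frac{\nu(T_{p,n}(1)\, f)}{\nu(T_{p,n}(1))} \quad\text{and}\quad U_{p,n}(F) = \frac{T_{p,n}(F)}{T_{p,n}(1)}$$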



This updating-prediction type representation of the semigroup $\Omega_{p,n}$ provides
a precise description of the dependence of $\Omega_{p,n}(\mu)$ with respect to
the measure $\mu$. Next we present a formula that emphasizes the role of the
one-step mappings $\Phi_p$ in the dynamical structure of these transformations.







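Lemma 8.8.1 For any $p \ge 1$ and $\eta \in \mathcal{P}(E_{p-1})$, we have

$$B_{p-1,n}[\eta^{\otimes (n-p+1)}]\ U_{p-1,n} = \Phi_p(\eta) \otimes \big(B_{p,n}[\Phi_p(\eta)^{\otimes (n-p)}]\ U_{p,n}\big)$$

Proof:

By the definition of the operator $T_{p-1,n}$ we have

$$\begin{aligned}
\eta^{\otimes (n-p+1)}\, T_{p-1,n} &= [\eta Q_{p-1,p}] \otimes [\eta Q_{p-1,p+1}] \otimes \cdots \otimes [\eta Q_{p-1,n}] \\
&= (\eta Q_p) \otimes [(\eta Q_p) Q_{p,p+1}] \otimes \cdots \otimes [(\eta Q_p) Q_{p,n}]
\end{aligned}$$

This implies that

$$\eta^{\otimes (n-p+1)}\, T_{p-1,n}(1) = \eta Q_p(1) \times [(\eta Q_p)^{\otimes (n-p)}\, T_{p,n}(1)]$$

On the other hand, for any $\varphi_1 \in \mathcal{B}_b(E_p)$ and $\varphi_2 \in \mathcal{B}_b(E_{(p,n]})$, we have
$\Phi_p(\eta)(\varphi_1) = \eta Q_p(\varphi_1)/\eta Q_p(1)$ and

$$\frac{(\eta Q_p)^{\otimes (n-p)}\, T_{p,n}(\varphi_2)}{(\eta Q_p)^{\otimes (n-p)}\, T_{p,n}(1)} = \frac{\Phi_p(\eta)^{\otimes (n-p)}\, T_{p,n}(\varphi_2)}{\Phi_p(\eta)^{\otimes (n-p)}\, T_{p,n}(1)} = B_{p,n}(\Phi_p(\eta)^{\otimes (n-p)})\ U_{p,n}(\varphi_2)$$

From these observations, we find that for any $f \in \mathcal{B}_b(E_p \times E_{(p,n]})$

$$B_{p-1,n}[\eta^{\otimes (n-p+1)}]\ U_{p-1,n}(f) = \big(\Phi_p(\eta) \otimes [B_{p,n}(\Phi_p(\eta)^{\otimes (n-p)})\ U_{p,n}]\big)(f)$$

This ends the proof of the lemma.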



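From the Feynman-Kac representation of $q$-tensor marginal distributions
given in Section 8.8.1, we see that the semigroup structure of the $q$-tensor
product measures on path space

$$\mathbb{K}_n^{(q)} = (\eta_0^{\otimes q} \otimes \cdots \otimes \eta_n^{\otimes q}) \in \mathcal{P}(E_{[0,n]}^{(q)})$$

can be studied using the same line of argument as above by replacing the
pair semigroups $(Q_{p,n}, \Phi_{p,n})$ by the $q$-tensor product semigroups $(Q_{p,n}^{(q)}, \Phi_{p,n}^{(q)})$.
We will use the superscript $(\cdot)^{(q)}$ to define the corresponding mathematical
quantities. To be more precise, let $\Omega_{p,n}^{(q)}$, $0 \le p \le n$, be the (nonlinear)
semigroup associated with the distribution flow $\mathbb{K}_n^{(q)}$ and given by
$\mathbb{K}_n^{(q)} = \Omega_{p,n}^{(q)}(\mathbb{K}_p^{(q)})$. From the preceding construction we check that $\Omega_{p,n}^{(q)}$ can
be described for any $\mu \in \mathcal{P}(E_0^q \times \cdots \times E_p^q)$ by the formula

$$\Omega_{p,n}^{(q)}(\mu) = \mu \otimes \big(B_{p,n}^{(q)}[\mu_p^{\otimes (n-p)}]\ U_{p,n}^{(q)}\big)$$

where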






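• $\mu_p^{\otimes (n-p)}$ is the $(n - p)$-tensor product distribution of
the $p$th time marginal $\mu_p \in \mathcal{P}(E_p^q)$ of $\mu$

• $U_{p,n}^{(q)}$ is the Markov transition from $(E_p^q)^{(n-p)}$ into $E_{(p,n]}^{(q)}$ $(= E_{p+1}^q \times \cdots \times E_n^q)$
and defined by $U_{p,n}^{(q)}(F) = T_{p,n}^{(q)}(F)/T_{p,n}^{(q)}(1)$ for any $F \in \mathcal{B}_b(E_{(p,n]}^{(q)})$
with

$$T_{p,n}^{(q)}(F)(x_p^1, \ldots, x_p^{n-p}) = \int_{E_{(p,n]}^{(q)}} \Big[\prod_{k=1}^{n-p} Q_{p,p+k}^{(q)}(x_p^k, dx_{p+k})\Big]\ F(x_{p+1}, \ldots, x_n)$$

• $B_{p,n}^{(q)}$ is the Boltzmann-Gibbs transformation on $\mathcal{P}((E_p^q)^{(n-p)})$ defined
for any pair $(\nu, f) \in \mathcal{P}((E_p^q)^{(n-p)}) \times \mathcal{B}_b((E_p^q)^{(n-p)})$ by

$$B_{p,n}^{(q)}(\nu)(f) = \nu(T_{p,n}^{(q)}(1)\, f)/\nu(T_{p,n}^{(q)}(1))$$

As in (8.22), we note that the nonhomogeneous potential functions
$T_{p,n}^{(q)}(1)$ are given by

$$T_{p,n}^{(q)}(1) = G_{p,p+1}^{(q)} \otimes G_{p,p+2}^{(q)} \otimes \cdots \otimes G_{p,n}^{(q)}$$

In other words, in terms of the potential functions $G_{p,n}$, for any
$(x_p^1, \ldots, x_p^{n-p}) \in (E_p^q)^{(n-p)}$ with $x_p^k = (x_p^{k,1}, \ldots, x_p^{k,q}) \in E_p^q$
for each $1 \le k \le (n - p)$, we have

$$T_{p,n}^{(q)}(1)(x_p^1, \ldots, x_p^{n-p}) = \prod_{k=1}^{n-p}\ \prod_{i=1}^{q} G_{p,p+k}(x_p^{k,i})$$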

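We end this section with the version of Lemma 8.8.1 in the context of
$q$-tensor product semigroups.

Lemma 8.8.2 For any $q, p \ge 1$ and $\eta \in \mathcal{P}(E_{p-1}^q)$ we have

$$B_{p-1,n}^{(q)}[\eta^{\otimes (n-p+1)}]\ U_{p-1,n}^{(q)} = \Phi_p^{(q)}(\eta) \otimes \big(B_{p,n}^{(q)}[\Phi_p^{(q)}(\eta)^{\otimes (n-p)}]\ U_{p,n}^{(q)}\big)$$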



8.9 Total Variation Estimates 

This section is mainly concerned with the proofs of Theorem 8.3.3 and 
Theorem 8.3.4. We recall that the forthcoming analysis is restricted to the 
simple mutation/selection genetic model. 







Proof of Theorem 8.3.3: We use the decomposition 

= E (8-23) 

p=0 



with the convention ^-i,o((»/-i)®’) = for p = 0. Our next objective is 
to estimate the differences of measures 



lii =d.t, [»S(«)®») - 



Using (8.21), we first observe that $p^ij.({tip^-i)®^) = ^p(tipl.i)®^ and for 
any / € Bb(E^) 

The conclusion now follows from Theorem 8.7.1. First we notice that for 
any /x € V{Ep) we have 



OSC^ (Gp^n) 



OSC(Gp,n) 

/^(Gp,n) 



^ (^p,n 1) 



with 



'Pin 



= sup 



Gp,n(^p) 



Xp,Vp£Ep Gp, 



,n(j/p) 



(8.24) 

Therefore, recalling that $\eta_p^N$ is the empirical measure associated with a
collection of $N$ conditionally independent and identically distributed random
variables with common law $\Phi_p(\eta_{p-1}^N)$, we find from (8.15) the $\mathbb{P}^N_{\eta_0}$-a.s.
estimate



|E^ia^?i(/)l^p-l)l ^ 77 [1 + (^)l 



We recall that, for any Markov transition $M$ from a measurable space $(E, \mathcal{E})$
into a (possibly different) measurable space $(E', \mathcal{E}')$ and for any $f \in \mathcal{B}_b(E')$,
we have the inequality $\mathrm{osc}(M(f)) \le \beta(M)\, \mathrm{osc}(f)$ (see (4.11) on page 128).
From this property, we conclude that, for any $f \in \mathcal{B}_b(E_n^q)$ with $\mathrm{osc}(f) \le 1$,
we have










By (8.23), it follows that, for any $f \in \mathcal{B}_b(E_n^q)$ with $\mathrm{osc}(f) \le 1$, we have



p=0 




Taking into account that 










Lemma 8.6.1 ensures that, for any $f \in \mathcal{B}_b(E_n^q)$ with $\mathrm{osc}(f) \le 1$, we have

iChW-’^’WI 




This ends the proof of Theorem 8.3.3. ■ 

To illustrate the impact of Theorem 8.3.3, we present hereafter some easily 
derived strong and uniform propagation-of-chaos estimates. 

Corollary 8.9.1 Let us suppose that the triplet $(E_n, G_n, M_n)$ is time-homogeneous
and the regularity condition $(G, M)$ introduced in (8.6) on page 259
is met for some $\epsilon(G), \epsilon(M) > 0$, $m \ge 1$. Then we have

s ^ (i + ■*’(«.") ii + 't;’(V/w)i) 

where 

4^(9, n) = f^(l-€^(G,M))lP/-l<(n + l)A(m£-«(G,M)) 

p=0 

with em{G,M) = e(”*“*)(G) e{M) and Cm («) is the mapping defined as 
ep,n(«) (see (8.8)) by replacing the constant rp,„ by rm^ = e“”‘(G)€“^(M). 

Proof: 

When the regularity conditions (8.6) are met, we recall that for any 0 < 
p + m < n we have the uniform estimate Tp,„ < c“’”(G)e“*(M) and for 
any ip, yp € and any nonnegative function <p on E^^^ 

(see for instance Proposition 4.3.3). By the definition of Dobrushin’s ergodic 
coefficient, this yields that 

Recalling that < e~(”~*’)(G), we observe that for any p < n 
rp,n < e-"'(G){e-\M) V 1) = e-’"(G)e-'(M) 







and consequently $\sup_{p \le n} e_{p,n}(u) \le e_m(u)$. From previous calculations,
we easily find that

IlCw - S ' w «i»'(2«"/JV)| <4'to,n)) 

This ends the proof of the corollary. ■ 



Corollary 8.9.2 Assume that the regularity assumptions stated in Corollary 8.9.1
are met for some $\epsilon(G), \epsilon(M) > 0$, and $m \ge 1$. Then, using the
same notation as there, we have the uniform propagation-of-chaos estimate



sup 

n>0 



(^q 



tv 




(l+ me-«(G,M) [1 + e^^^W/N)]) 



In addition, for any nondecreasing sequence of time horizons $n(N)$ and
particle block sizes $q(N)$ such that $\lim_{N \to \infty} n(N)\, q^2(N)/N = 0$, we have
the increasing propagation-of-chaos property

qHNHN) < c/(e”*(G)€(M))2 

The end of this section is concerned with the proof of Theorem 8.3.4. 
Our first task is to better connect the distributions 

PW’) = Law((^[o.„,)i<K,) € 

with the Markovian structure of the interacting particle model defined in 
(7.4). We recall that for each $0 \le p \le n$ and $1 \le q \le N$ the state space
$E_{[p,n]}^{(q)} = E_p^q \times \cdots \times E_n^q$
represents the set of the first $q$ paths from time $p$ to time $n$ of the Markov
particle model, while the product space $(E_p \times \cdots \times E_n)^q$ represents
the state space of the paths of each of the first $q$ elementary particles from
time $p$ up to time $n$. We shall also use the notation

We recall that and are connected by the mapping 0’ defined 
in (8.2). For instance, for q = N we have 0iJ^($o. --,Cn) = ^[o,n] and 

•Ci = » (e?)-'. ° (f« «")■' e P(<.I|). 

In this connection, it is also convenient to associate with the pair of path 
measures ((K^)®’,(K^)®’) defined in (8.3) the distributions 

KUV.,)^(K^)®,o(0«)-i and Kf-’> = (O®«o(0«)-i€P(Eg)^^^ 







In other words, we have with some obvious abusive notation 
K(^.9) = ^ Y a (,) 

^ Jifq ((fo )l<<<« Kin n<i<q) 

a€{N)if> 

" JWa ^ *^((«?‘*’)><<<» «n“’):<i<,) 

^ ’•>ae{q,N) 

Lemma 8.9.1 For any pair of integers \ <q< N and any test function 
F e with ||F|| < 1, we have 

lEj ( kW9)(F)) - E^{F{ia)i<i<q, (C)i<i<,))l < (9 - l)VN 

Proof: 

By Lemma 8.6.1, we observe that 

11(1^)®'' - (HC)®’lltv < (9 - 1)V^^ (8.25) 

By the exchangeability property of the particle model, we also have that 
for any a € ( 9 , N) and F € 

Kimo%<i<q^ • ■ - (C^‘^)i<i<,)) = Ki^m)i<i<q, • ■ • , (e;)i<K,)) 

This implies that 

E^{K<r’^HF)) = E;)^(F((^‘)i<k„ . . . , (^’ )i<K,)) 

However, (8.25) also ensures that

\\^N,q) _ kW,) 11^^ < _ 1)2/^ (8.26) 

from which the end of the proof of the lemma is easily completed. ■ 
We are now in a position to prove the theorem. 

Proof of Theorem 8.3.4: In a similar fashion as in the proof of Theorem 8.3.3,
we use the decomposition

Kf .«/) - K(?) = ))] (8.27) 

p=0 

As usual, we take the convention for p = 0, To 

describe more precisely each term in the summand above, we first observe 
that, for the g-tensor product measure, (tj^)®’ € P(F|) is the pth time 

marginal of On the other hand, by the definition of the semigroup 

flp?n, we have 







and 




This implies that for any 1 < p < n we have 



Let be the random measures defined by 

Using Lemma 8.6.1, we find that for any p < n 

l|n&"> - !$> < (« - 1)VW (8.28) 



As an aside, using Lemma 8.8.2, we already notice that 



l,n 



(8.29) 



Now, by (8.27), the estimates (8.28) imply that 






p=0 



<2{n + l){q-iy/N (8.30) 

tv 



with the convention, for p = 0, = K« ^ = t?®’ ® ®y 

symmetry arguments, it is now convenient to observe that, for any p < n 
and any test function (p G ®6(^^,n])’ sequence of random variables 



=k{I *’((«p)i<.<..»>iF/ii) 



does not depend on the choice of a G {q, N), where the integral is taken 
over the product space E^^n]- Using this property, we prove that for any 







f ^ 8,(S(’V 

I 

= I fp-i) 

=E!;;([Kj^i''’®<)®'®4S((<)®’'"-’'')t'i?2K/) I F^L,) 

On the other hand, using the fact that - (»/^)®’||tv < ( 9 - 1)^/-^^, 

we find the P^-a.s. estimate 

I ^P-i)l ^ ll/il (9 - l)V^ (8-31) 

with 

=« K-f ® «)*' ® 

In the display above and for p = 0, we have used the convention 
In this notation, we have by (8.29) the formula 

Let B^n be the extended Boltzmann-Gibbs transformation on 
defined for any pair {y,<p) £ x by 

with the potential functions 2pjn(l) on given by 

f^{i) = 

In this notation, recalling that Gp,p = 1, we find that, for any p € ViE^) 
and 1 / € V{Ep^'^~^^)y we have = Bp’n(/i®t')- This readily yields 

that 

Q{g,N) q(9,AT> 

»p,n - “p-l,n 

Using Theorem 8.7.1 and arguing as in (8.24), we obtain the $\mathbb{P}^N_{\eta_0}$-a.s.
estimate

N N )\ 







as soon as $(n + 1)q \le N$ and $\|f\| \le 1$. This readily implies the rather crude
and almost sure upper bound



,)l < c fi 



1+Cn 



/ 2(g(n + 1))^ 

V N 



(8.32) 

with the mapping $e_n(u)$ defined as in (8.8) by replacing the constants $r_{p,n}$
by $r_n = \sup_{p \le n} r_{p,n}$. Combining (8.30), (8.31), and (8.32), we conclude
that for any $f \in \mathcal{B}_b(E_{[0,n]}^{(q)})$ with $\|f\| \le 1$



ESdld"''* - k!?'1(/))| < c (n + 1) |i + 



By the definition of V^’n and the total variation estimate (8.9) 

is now a simple application of Lemma 8.9.1. This completes the proof of 
Theorem 8.3.4. ■ 




9 

Central Limit Theorems 



9.1 Introduction 

The central limit theorem (abbreviated CLT) is one of the most startling
results in probability theory. Loosely speaking, it expresses the fact that
the sums of local and small independent disturbances (with finite variances)
behave asymptotically, at least as Gaussian variables. The first CLT was
stated and proved for symmetric and Bernoulli independent disturbances
by A. De Moivre in the 18th century (Miscellanea analytica supplementum,
1730). This result was extended by P.S. Laplace in 1812 to general Bernoulli
trials in his celebrated treatise Théorie analytique des probabilités.

These two pioneering studies have been further developed by several
mathematicians, such as, in alphabetical order, Donsker, Dynkin, Feller,
Jacod, Lindeberg, Lyapunov, Mandelbaum, and Shiryaev. These developments
were followed in various directions going from multidimensional models,
symmetric statistics sequences, and empirical processes to fairly general
classes of nonidentically distributed and dependent random variables such
as properly scaled random triangular arrays or martingale sequences. There
also exist several approximation tools, such as the so-called $\delta$-method or
Slutsky's technique, to deduce various weak limit results from a given CLT.
It is of course out of the scope of this book to review in detail all of these
developments, and rather we refer the interested reader to any classical
textbook on probability and limit theorems for stochastic processes.

Most of these CLTs are derived using extended versions of the celebrated
Lévy's convergence theorem, which basically says that the weak convergence
of distributions corresponds exactly to the pointwise convergence of
their characteristic functions. In this connection, we also mention that the
inequality of Berry and Esseen provides an estimation of the maximum
difference between distribution functions by means of an average difference
between their characteristic functions. This inequality allows us to quantify
the speed of convergence of the CLT.

The aim of this chapter is to extend these results to interacting particle
approximation models. These fluctuation results provide precise asymptotic
information on the various $L_p$ mean error bounds and the propagation of
chaos derived respectively in Chapter 7 and Chapter 8. The reader will find
a deep study of topics such as the multidimensional CLTs for normalized
and unnormalized particle approximation measures, extended versions of
the theorems of Berry and Esseen and of Donsker to interacting particle
models, and a fluctuation result for particle McKean measures on path
space. While our approach to CLTs is well suited to analyze any sufficiently
regular McKean interpretation model, to simplify the presentation,
we restrict our study to McKean models of the form

$$K_{n+1,\eta}(x, \cdot) = \epsilon_n G_n(x)\, M_{n+1}(x, \cdot) + (1 - \epsilon_n G_n(x))\, \Phi_{n+1}(\eta) \qquad (9.1)$$

where $\epsilon_n$ are nonnegative constants such that $\epsilon_n G_n \le 1$. As in Chapter 8,
we note that the pair of examples provided on page 219 can be cast
in the form above, and the case $\epsilon_n = 0$ corresponds to the traditional
mutation/selection genetic algorithm. As usual, we shall suppose that the
potential functions $G_n$ satisfy the traditional condition $(G)$ introduced on
page 115 for some sequence of parameters $\epsilon_n(G) > 0$.
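A finite-state sketch may help to see why transitions of the form (9.1) are natural. Assuming, as elsewhere in the book, that the one-step mapping is given by selection with the potential followed by mutation, the kernel in (9.1) maps $\eta$ to $\Phi_{n+1}(\eta)$ for every admissible value of the constant. The code below checks this exactly on a hypothetical three-state example (the potential, the mutation matrix and all names are ours).

```python
from fractions import Fraction as F


def phi(eta, G, M):
    """One-step map: eta -> Psi_G(eta) M (selection by G, then mutation by M)."""
    d = len(eta)
    mass = sum(eta[i] * G[i] for i in range(d))
    return [sum(eta[i] * G[i] * M[i][j] for i in range(d)) / mass for j in range(d)]


def mckean_kernel(eta, G, M, eps):
    """K_eta(x, .) = eps G(x) M(x, .) + (1 - eps G(x)) Phi(eta), assuming eps * G <= 1."""
    target = phi(eta, G, M)
    d = len(eta)
    return [
        [eps * G[i] * M[i][j] + (1 - eps * G[i]) * target[j] for j in range(d)]
        for i in range(d)
    ]


def push_forward(eta, K):
    """The measure eta K."""
    d = len(eta)
    return [sum(eta[i] * K[i][j] for i in range(d)) for j in range(d)]


# hypothetical three-state example
ETA = [F(1, 2), F(1, 3), F(1, 6)]
POTENTIAL = [F(1), F(2), F(1, 2)]
MUTATION = [[F(1, 2), F(1, 2), F(0)], [F(1, 4), F(1, 2), F(1, 4)], [F(0), F(1, 3), F(2, 3)]]
```

With `eps = 0` every row of the kernel equals $\Phi_{n+1}(\eta)$, which is the mutation/selection case mentioned above; with `eps = F(1, 2)` a particle at `x` is moved by the mutation alone with probability `eps * G(x)` and is resampled from $\Phi_{n+1}(\eta)$ otherwise.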

The chapter is organized as follows. Section 9.3 and Section 9.4 focus on 
multidimensional CLTs for particle density profiles. We examine the par- 
ticle approximation models associated with unnormalized and normalized 
Feynman-Kac measmes. We design an elegant strategy based on martin- 
gales and semigroups techniques which allows us to reduce the fluctuation 
analysis to local sampling errors. In Section 9.5 we study the rate of con- 
vergence of these CLTs and provide a Berry-Esseen type theorem for ab- 
stract martingale sequences and interacting processes. We provide simple 
regularity conditions on the increasing processes underwhich we can de- 
rive precise estimations of the characteristic functions. In Section 9.6, we 
discuss the fluctuations of particle random fields and prove an extended 
version of Donsker’s theorem for interacting particle models. We develop 
an empirical process technique based on sub-Gaussian maximal inequalities 
to prove the asymptotic tightness of random field errors. All of these fluctu- 
ation results for interacting processes are restricted to uniformly bounded 
classes of test functions. Some extensions to unbounded functions can be
found in [89]. We emphasize that the strategy for CLTs presented in Sec- 
tion 9.4 to Section 9.6 is not related to any kind of regularity conditions on 
the mutation transition $M_n$. Thus, it applies to path space and genealogical
particle models. In Section 9.7, we analyze the fluctuations of particle
McKean measures under the regularity condition (M)**** introduced on 
page 116. The approach is essentially based on a theorem of Dynkin and 
Mandelbaum and a formula of Shiga and Tanaka. In the final section,
Section 9.8, we provide an explicit description of the inverse of the $L_2$ integral
operator which characterizes these fluctuations on path space.



9.2 Some Preliminaries 

In the context of interacting particle models, the local errors induced by 
the approximation sampling transitions of each particle can be interpreted 
as small disturbances in the time evolution of the particle density profiles.
To be more precise, let us consider the particle approximation model
$\xi_n = (\xi_n^i)_{1\le i\le N} \in E_n^N$, $n \in \mathbb{N}$, associated with a nonlinear measure-valued
equation of the form



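$$\eta_n = \eta_{n-1} K_{n,\eta_{n-1}} \in \mathcal{P}(E_n) \qquad (9.2)$$

where $K_{n,\eta}$ is a collection of Markov transitions from a measurable space
$E_{n-1}$ into another $E_n$. The $n$th sampling error is the $\mathcal{M}(E_n)$-valued random
variable $V_n^N$ defined by the formula

$$\eta_n^N = \eta_{n-1}^N K_{n,\eta_{n-1}^N} + \frac{1}{\sqrt{N}}\, V_n^N \qquad (9.3)$$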

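Notice that $V_n^N$ is itself the sum of the local errors induced by the random
elementary transitions of the $N$ particles; that is, we have

$$V_n^N = \sum_{i=1}^N \Delta_i V_n^N$$

with the “local” terms given for any $\varphi_n \in \mathcal{B}_b(E_n)$ by

$$\Delta_i V_n^N(\varphi_n) = \frac{1}{\sqrt{N}} \left[ \varphi_n(\xi_n^i) - K_{n,\eta_{n-1}^N}(\varphi_n)(\xi_{n-1}^i) \right] \qquad (9.4)$$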

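By the definition of the particle model, $\eta_n^N$ is the empirical measure associated
with a collection of conditionally independent random variables
$\xi_n^i$ with distributions $K_{n,\eta_{n-1}^N}(\xi_{n-1}^i, \cdot)$. From this simple observation, we
readily find that

$$E(V_n^N(\varphi_n)) = 0 \quad\text{and}\quad E(V_n^N(\varphi_n)^2) = E\left(\eta_{n-1}^N\left(K_{n,\eta_{n-1}^N}[\varphi_n - K_{n,\eta_{n-1}^N}(\varphi_n)]^2\right)\right)$$

In addition, for sufficiently regular McKean interpretation models, the
asymptotic results developed in Section 7.4 apply, and we find that

$$\lim_{N\to\infty} E(V_n^N(\varphi_n)^2) = \eta_{n-1}\left(K_{n,\eta_{n-1}}[\varphi_n - K_{n,\eta_{n-1}}(\varphi_n)]^2\right)$$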
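The two moment identities for the sampling error can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the state space is finite with three states, the transition matrix `K` is a fixed hypothetical Markov matrix playing the role of $K_{n,\eta_{n-1}^N}$ once the parent particles are frozen, and the names `kernel_apply`, `conditional_variance`, and `sampling_error` are introduced here and do not come from the text.

```python
import random

def kernel_apply(K, phi):
    """(K phi)(x) = sum_y K[x][y] * phi[y] for a finite-state Markov matrix K."""
    return [sum(p * v for p, v in zip(row, phi)) for row in K]

def conditional_variance(K, phi, parents):
    """eta^N_{n-1}( K[(phi - K phi)^2] ) evaluated on the given parent states."""
    Kphi = kernel_apply(K, phi)
    total = 0.0
    for x in parents:
        total += sum(p * (v - Kphi[x]) ** 2 for p, v in zip(K[x], phi))
    return total / len(parents)

def sampling_error(K, phi, parents, rng):
    """One draw of V_n^N(phi) = sqrt(N) [eta_n^N(phi) - eta_{n-1}^N K(phi)]."""
    n = len(parents)
    Kphi = kernel_apply(K, phi)
    states = range(len(phi))
    # each child is sampled independently from the row of K indexed by its parent
    children = [rng.choices(states, weights=K[x])[0] for x in parents]
    mean_child = sum(phi[y] for y in children) / n
    mean_pred = sum(Kphi[x] for x in parents) / n
    return n ** 0.5 * (mean_child - mean_pred)

# Hypothetical toy data: 3 states, N = 100 frozen parent particles.
K = [[0.7, 0.3, 0.0], [0.2, 0.5, 0.3], [0.0, 0.4, 0.6]]
phi = [1.0, -1.0, 2.0]
parents = [0, 1, 2, 1] * 25
rng = random.Random(0)
draws = [sampling_error(K, phi, parents, rng) for _ in range(2000)]
emp_mean = sum(draws) / len(draws)
emp_var = sum(d * d for d in draws) / len(draws)
exact_var = conditional_variance(K, phi, parents)
```

Conditionally on the parents, `emp_mean` should be close to zero and `emp_var` close to `exact_var`, which is the finite-$N$ counterpart of the limiting variance displayed above.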







More precise regularity conditions are provided in the preliminary section,
Section 9.3. Roughly speaking, the results above indicate that the random
particle sampling error is globally unbiased, it has a finite variance, and
the asymptotic limiting variance is known. Furthermore, the formula (9.3)
also shows that the particle density profiles $\eta_n^N$ satisfy “almost” the same
equation (9.2) as the limiting measures $\eta_n$. To get one step further in our
discussion, we introduce the following double triangular array:



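$$\begin{array}{cccc}
\Delta_1 V_0^N(\varphi_0), & \Delta_2 V_0^N(\varphi_0), & \ldots, & \Delta_N V_0^N(\varphi_0) \\
\Delta_1 V_1^N(\varphi_1), & \Delta_2 V_1^N(\varphi_1), & \ldots, & \Delta_N V_1^N(\varphi_1) \\
\vdots & \vdots & & \vdots \\
\Delta_1 V_n^N(\varphi_n), & \Delta_2 V_n^N(\varphi_n), & \ldots, & \Delta_N V_n^N(\varphi_n)
\end{array}$$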




The variables $(\Delta_i V_n^N(\varphi_n))_{1\le i\le N}$ on each $n$th row are independent conditionally
with respect to the rows with levels $p < n$. In addition, and in view
of (9.4), each term $\Delta_i V_n^N(\varphi_n)$ is “negligible” in comparison with the sum
$V_n^N(\varphi_n)$. Applying the classical CLT for triangular arrays (see for instance
Theorem 4 on p. 543 in [287]), we find that the sum $V_n^N(\varphi_n)$ of the terms
of each row converges in law as $N \to \infty$ to a Gaussian random variable
$V_n(\varphi_n)$ such that



$$E(V_n(\varphi_n)) = 0 \quad\text{and}\quad E(V_n(\varphi_n)^2) = \eta_{n-1}\left(K_{n,\eta_{n-1}}[\varphi_n - K_{n,\eta_{n-1}}(\varphi_n)]^2\right) \qquad (9.5)$$

These elementary observations are the first steps to our study of CLTs for
particle models. In Section 9.3, we will make precise this fluctuation result
and will prove that the complete random sequence $(V_p^N(\varphi_p))_{p=0,\ldots,n}$ formed
by the sums of each row converges in law to a sequence of independent
and Gaussian random variables $(V_p(\varphi_p))_{p=0,\ldots,n}$ with means and variances
prescribed by the formulae (9.5). In this preliminary section, we also present
some elementary tools to transfer CLTs such as the Slutsky technique and
the so-called $\delta$-method.

Although these elementary fluctuations give some insight on the asymptotic
normal behavior of the local errors accumulated by the sampling
scheme, they do not give directly a CLT result for the difference between
the particle measures $\eta_n^N$ or $\gamma_n^N$ and the corresponding limiting measures
$\eta_n$ and $\gamma_n$. Nevertheless, as the reader may have certainly noticed, the martingale
decompositions exhibited in Proposition 7.4.1 in Section 7.4.2 are
expressed in terms of the sequence of local errors $V_n^N$. In Section 9.4, we
shall combine this important observation together with the approximation
tools provided in Section 9.3 to analyze the fluctuations of $\eta_n^N$ or $\gamma_n^N$.







9.3 Some Local Fluctuation Results 

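Let $\mathcal{F}^N = \{\mathcal{F}_n^N ; n \ge 0\}$ be the natural filtration associated with the
$N$-particle system $\xi_n$. The first class of martingales that arises naturally in
our context is the $\mathbb{R}^d$-valued and $\mathcal{F}^N$-martingale $M_n^N(f)$ defined by

$$M_n^N(f) = \sum_{p=0}^n \left[ \eta_p^N(f_p) - \Phi_p(\eta_{p-1}^N)(f_p) \right] \qquad (9.6)$$

where $f_p : x_p \in E_p \mapsto f_p(x_p) = (f_p^u(x_p))_{u=1,\ldots,d} \in \mathbb{R}^d$ is a $d$-dimensional
and bounded measurable function. By direct inspection, we see that the
$u$th component of the martingale $M_n^N(f) = (M_n^N(f^u))_{u=1,\ldots,d}$ is the
real-valued $\mathcal{F}^N$-martingale defined for any $u = 1, \ldots, d$ by the formula

$$M_n^N(f^u) = \sum_{p=0}^n \left[ \eta_p^N(f_p^u) - \Phi_p(\eta_{p-1}^N)(f_p^u) \right] = \sum_{p=0}^n \left[ \eta_p^N(f_p^u) - \eta_{p-1}^N K_{p,\eta_{p-1}^N}(f_p^u) \right]$$

with the usual convention $\eta_{-1}^N K_{0,\eta_{-1}^N} = \eta_0 = \Phi_0(\eta_{-1}^N)$ for $p = 0$.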

Most of the results presented here are based on the following CLT for 
the martingale M^{f). 

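Theorem 9.3.1 For any $d \ge 1$ and for any sequence of bounded measurable
functions $f_p = (f_p^u)_{u=1,\ldots,d} \in \mathcal{B}_b(E_p)^d$, $p \ge 0$, the $\mathbb{R}^d$-valued and
$\mathcal{F}^N$-martingale $\sqrt{N}\, M_n^N(f)$ converges in law to an $\mathbb{R}^d$-valued and Gaussian
martingale $M_n(f) = (M_n(f^u))_{u=1,\ldots,d}$ such that for any $n \ge 0$ and
$1 \le u, v \le d$

$$\langle M(f^u), M(f^v) \rangle_n = \sum_{p=0}^n \eta_{p-1}\left[ K_{p,\eta_{p-1}}\left( (f_p^u - K_{p,\eta_{p-1}} f_p^u)(f_p^v - K_{p,\eta_{p-1}} f_p^v) \right) \right]$$

with the convention $\eta_{-1} K_{0,\eta_{-1}} = \eta_0$ for $p = 0$.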

Proof: 

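The idea of the proof, due to Jean Jacod, consists in using the CLT for
triangular arrays of $\mathbb{R}^d$-valued random variables (Theorem 3.33, p. 437
in [189]). We first rewrite the martingale $\sqrt{N}\, M_n^N(f)$ in the following form:

$$\sqrt{N}\, M_n^N(f) = \sum_{i=1}^N \sum_{p=0}^n \frac{1}{\sqrt{N}} \left( f_p(\xi_p^i) - K_{p,\eta_{p-1}^N}(f_p)(\xi_{p-1}^i) \right)$$

This readily yields $\sqrt{N}\, M_n^N(f) = \sum_{k=1}^{(n+1)N} U_k^N(f)$, where for any $1 \le k \le (n+1)N$
with $k = pN + i$ for some $i = 1, \ldots, N$ and $p = 0, \ldots, n$

$$U_k^N(f) = \frac{1}{\sqrt{N}} \left( f_p(\xi_p^i) - K_{p,\eta_{p-1}^N}(f_p)(\xi_{p-1}^i) \right)$$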







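We further denote by $\mathcal{G}_k^N$ the $\sigma$-algebra generated by the random variables
$\xi_p^j$ for any pair index $(j,p)$ such that $pN + j \le k$. It is readily checked that,
for any $1 \le u \le v \le d$ and for any $1 \le k \le (n+1)N$ with $k = pN + i$ for
some $i = 1, \ldots, N$ and $p = 0, \ldots, n$, we have $E(U_k^N(f^u) \mid \mathcal{G}_{k-1}^N) = 0$ and

$$E(U_k^N(f^u)\, U_k^N(f^v) \mid \mathcal{G}_{k-1}^N) = \frac{1}{N}\, K_{p,\eta_{p-1}^N}\left[ (f_p^u - K_{p,\eta_{p-1}^N} f_p^u(\xi_{p-1}^i))\,(f_p^v - K_{p,\eta_{p-1}^N} f_p^v(\xi_{p-1}^i)) \right](\xi_{p-1}^i)$$

This also yields that

$$\sum_{k=pN+1}^{pN+N} E(U_k^N(f^u)\, U_k^N(f^v) \mid \mathcal{G}_{k-1}^N) = \eta_{p-1}^N\left[ K_{p,\eta_{p-1}^N}\left( (f_p^u - K_{p,\eta_{p-1}^N} f_p^u)(f_p^v - K_{p,\eta_{p-1}^N} f_p^v) \right) \right]$$

Our aim is now to describe the limiting behavior of the martingale
$\sqrt{N}\, M_n^N(f)$ in terms of the process $X_t^N(f) \stackrel{\text{def}}{=} \sum_{k=1}^{[Nt]+N} U_k^N(f)$. By the
definition of the particle model associated with a given mapping $\Phi_n$ and
using the fact that $[[Nt]/N] = [t]$, one gets that for any $1 \le u, v \le d$

$$\sum_{k=1}^{[Nt]+N} E(U_k^N(f^u)\, U_k^N(f^v) \mid \mathcal{G}_{k-1}^N) = C_{[t]}^N(f^u, f^v) + \left( \frac{[Nt]}{N} - [t] \right) \left( C_{[t]+1}^N(f^u, f^v) - C_{[t]}^N(f^u, f^v) \right)$$

where, for any $n \ge 0$ and $1 \le u, v \le d$,

$$C_n^N(f^u, f^v) = \sum_{p=0}^n \eta_{p-1}^N\left[ K_{p,\eta_{p-1}^N}\left( (f_p^u - K_{p,\eta_{p-1}^N} f_p^u)(f_p^v - K_{p,\eta_{p-1}^N} f_p^v) \right) \right]$$

This implies that for any $1 \le u, v \le d$,

$$\sum_{k=1}^{[Nt]+N} E(U_k^N(f^u)\, U_k^N(f^v) \mid \mathcal{G}_{k-1}^N) \xrightarrow[N\to\infty]{} C_t(f^u, f^v) \quad\text{in probability}$$

with

$$C_n(f^u, f^v) = \sum_{p=0}^n \eta_{p-1}\left[ K_{p,\eta_{p-1}}\left( (f_p^u - K_{p,\eta_{p-1}} f_p^u)(f_p^v - K_{p,\eta_{p-1}} f_p^v) \right) \right]$$

and for any $t \in \mathbb{R}_+$, writing $\{t\} = t - [t]$,

$$C_t(f^u, f^v) = C_{[t]}(f^u, f^v) + \{t\} \left( C_{[t]+1}(f^u, f^v) - C_{[t]}(f^u, f^v) \right)$$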







Since $\|U_k^N(f)\| \le \frac{2}{\sqrt{N}} \left( \vee_{p\le n} \|f_p\| \right)$ for any $1 \le k \le [Nt] + N$, the conditional
Lindeberg condition is clearly satisfied and therefore one concludes
that the $\mathbb{R}^d$-valued martingale $\{X_t^N(f) ; t \in \mathbb{R}_+\}$ converges in law to
a continuous Gaussian martingale $\{X_t(f) ; t \in \mathbb{R}_+\}$ such that, for any
$1 \le u, v \le d$ and $t \in \mathbb{R}_+$, $\langle X(f^u), X(f^v) \rangle_t = C_t(f^u, f^v)$. Recalling that
$X_n^N(f) = \sqrt{N}\, M_n^N(f)$, the proof of the theorem is completed. ■
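For a finite state space, the limiting bracket of Theorem 9.3.1 can be evaluated exactly. The following sketch is an illustration only and makes several assumptions not found in the text: the model is time homogeneous (one potential vector `G`, one Markov matrix `M`), the same scalar test function `f` is used at every time, and the McKean transitions are of the form (9.1) with a constant parameter `eps`. The function names are hypothetical.

```python
def integrate(eta, f):
    """eta(f) for a probability vector eta and a function f on a finite set."""
    return sum(e * v for e, v in zip(eta, f))

def mat_vec(M, f):
    """(M f)(x) = sum_y M[x][y] f[y]."""
    return [sum(m * v for m, v in zip(row, f)) for row in M]

def phi_update(eta, G, M):
    """One step of the Feynman-Kac flow: Phi(eta) = eta(G M) / eta(G)."""
    w = [e * g for e, g in zip(eta, G)]
    z = sum(w)
    size = len(eta)
    return [sum(w[x] * M[x][y] for x in range(size)) / z for y in range(size)]

def limiting_variance(eta0, G, M, f, n, eps):
    """C_n(f) = sum_{p=0}^n eta_{p-1}[K_p((f - K_p f)^2)] for
    K_p(x, .) = eps G(x) M(x, .) + (1 - eps G(x)) eta_p  (assumes eps * G <= 1)."""
    eta = list(eta0)
    f2 = [v * v for v in f]
    total = integrate(eta, f2) - integrate(eta, f) ** 2   # term p = 0
    Mf = mat_vec(M, f)
    for _ in range(n):
        nxt = phi_update(eta, G, M)
        nxt_f = integrate(nxt, f)
        Kf = [eps * g * mf + (1.0 - eps * g) * nxt_f for g, mf in zip(G, Mf)]
        # eta_{p-1} K (f^2) = eta_p(f^2) because eta_{p-1} K = eta_p
        total += integrate(nxt, f2) - integrate(eta, [k * k for k in Kf])
        eta = nxt
    return total

# Hypothetical two-state example.
M = [[0.5, 0.5], [0.25, 0.75]]
G = [0.5, 1.0]
eta0 = [0.5, 0.5]
f = [0.0, 1.0]
var_selection = limiting_variance(eta0, G, M, f, 3, eps=0.0)
var_mixed = limiting_variance(eta0, G, M, f, 3, eps=1.0)
```

With `eps = 0` each term reduces to the variance of `f` under $\eta_p$, and by Jensen's inequality any admissible `eps > 0` can only decrease the sum, which is what the comparison of `var_mixed` and `var_selection` illustrates.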

A first consequence of Theorem 9.3.1 is a CLT for the random vector

$$\mathcal{V}_n^N(\varphi) = (V_0^N(\varphi_0), \ldots, V_n^N(\varphi_n)) \in \mathbb{R}^{d_0} \times \cdots \times \mathbb{R}^{d_n}$$

defined for any $0 \le p \le n$ by

$$V_p^N(\varphi_p) = \sqrt{N}\, [\eta_p^N - \Phi_p(\eta_{p-1}^N)](\varphi_p)$$

with $\varphi = (\varphi_n)_{n\ge 0}$ and $\varphi_n \in \mathcal{B}_b(E_n)^{d_n}$, $d_n \ge 1$, for all $n \in \mathbb{N}$. Loosely
speaking, and as we already mentioned in the introductory section, the
next corollary expresses the fact that the local errors associated with the
particle approximation sampling steps behave asymptotically as a sequence
of independent and centered Gaussian random variables.

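Corollary 9.3.1 The sequence of random fields $\mathcal{V}_n^N = (V_p^N)_{0\le p\le n}$ converges
in law, as $N \to \infty$, to a sequence $\mathcal{V}_n = (V_p)_{0\le p\le n}$ of $(n+1)$ independent
and Gaussian random fields $V_p$ with, for any $\varphi_p^1, \varphi_p^2 \in \mathcal{B}_b(E_p)$,

$$E(V_p(\varphi_p^1)) = 0 \quad\text{and}\quad E(V_p(\varphi_p^1)\, V_p(\varphi_p^2)) = \eta_{p-1}\left[ K_{p,\eta_{p-1}}\left( (\varphi_p^1 - K_{p,\eta_{p-1}}\varphi_p^1)(\varphi_p^2 - K_{p,\eta_{p-1}}\varphi_p^2) \right) \right]$$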


Proof: 

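Let $\varphi_n$ be a sequence of bounded measurable functions in $\mathcal{B}_b(E_n)^{d_n}$. We
associate with $\varphi = (\varphi_n)_{n\ge 0}$ the sequence of functions

$$f_p = (f_p^1, \ldots, f_p^{n+1}) \in \mathcal{B}_b(E_p)^{d_0} \times \cdots \times \mathcal{B}_b(E_p)^{d_n}$$

$0 \le p \le n$, on $E_p$ defined for any $0 \le p, u \le n$ by

$$f_p^{u+1} = 1_p(u)\, \varphi_p \in \mathcal{B}_b(E_p)^{d_u}$$

By construction, we have for any $0 \le p \le n$

$$\sqrt{N}\, M_n^N(f^{p+1}) = \sqrt{N}\, [\eta_p^N(\varphi_p) - \eta_{p-1}^N K_{p,\eta_{p-1}^N}(\varphi_p)] = V_p^N(\varphi_p)$$

and therefore

$$\sqrt{N}\, M_n^N(f) = (\sqrt{N}\, M_n^N(f^{p+1}))_{0\le p\le n} = (V_p^N(\varphi_p))_{0\le p\le n} = \mathcal{V}_n^N(\varphi)$$

By Theorem 9.3.1, we conclude that $\mathcal{V}_n^N(\varphi)$ converges in law to an $(n+1)$-dimensional
and centered Gaussian random field $\mathcal{V}_n(\varphi) = (V_p(\varphi_p))_{0\le p\le n}$
with, for any $0 \le p, q \le n$,

$$E(V_p(\varphi_p^1)\, V_q(\varphi_q^2)) = \langle M(f^{p+1,1}), M(f^{q+1,2}) \rangle_n = 1_p(q)\ \eta_{p-1}\left[ K_{p,\eta_{p-1}}\left( (\varphi_p^1 - K_{p,\eta_{p-1}}\varphi_p^1)(\varphi_p^2 - K_{p,\eta_{p-1}}\varphi_p^2) \right) \right]$$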

This ends the proof of the corollary. ■ 

The next results provide some important approximation tools that
can be used in conjunction with Corollary 9.3.1.

Theorem 9.3.2 (Slutsky) Suppose that $(V_1^N)_{N\ge 1}$ and $(V_2^N)_{N\ge 1}$ are random
variables taking values in some separable metric space $(E,d)$. If $V_1^N$
converges in law, as $N \to \infty$, to some random variable $V_1$, and $d(V_1^N, V_2^N)$
converges to 0 in probability, then $V_2^N$ converges in law, as $N \to \infty$, to $V_1$.

Proof: 

For any $\delta > 0$, we let $A^\delta = \{x \in E : d(x,A) = \inf_{y\in A} d(x,y) \le \delta\}$ be the
closed blowup of a closed subset $A \subset E$. Observe that

$$P(V_2^N \in A) = P(V_2^N \in A,\ d(V_1^N, V_2^N) \le \delta) + P(V_2^N \in A,\ d(V_1^N, V_2^N) > \delta) \le P(V_1^N \in A^\delta) + P(d(V_1^N, V_2^N) > \delta)$$

Under our assumptions, and since $A^\delta$ is closed, we deduce that

$$\limsup_{N\to\infty} P(V_2^N \in A) \le \limsup_{N\to\infty} P(V_1^N \in A^\delta) \le P(V_1 \in A^\delta)$$

Letting $\delta \downarrow 0$, we conclude that $\limsup_{N\to\infty} P(V_2^N \in A) \le P(V_1 \in A)$ for
any closed subset $A$. This clearly ends the proof of the theorem. ■

Recalling that the convergence in law to some constant implies the conver- 
gence in probability, we have the following corollary. 

Corollary 9.3.2 Suppose that $V_1^N$ converges in law to some finite constant
$v_1$ and $V_2^N$ converges in law to some random variable $V_2$. Then $(V_1^N, V_2^N)$
converges in law to $(v_1, V_2)$. In addition, if $V_3^N$ converges in law to some
finite constant $v_3$, then $V_1^N V_2^N + V_3^N$ converges in law to $v_1 V_2 + v_3$.

We also quote without proof the following traditional theorem. 







Theorem 9.3.3 (continuous mapping theorem) Let $(V^N)_{N\ge 1}$ be a sequence
of random variables taking values in some metric space $(E_1, d_1)$. Also let $F$
be a mapping from $E_1$ into an auxiliary metric space $(E_2, d_2)$. Assume that
$V^N$ converges in law to some random variable $V$. If $F$ is almost continuous
with respect to the distribution of $V$ (i.e., $P(V \in C(F)) = 1$, where $C(F)$ is
the continuity set of $F$), then $F(V^N)$ converges in law to $F(V)$.

Theorem 9.3.4 (Skorohod) Let $P$ and $(P_N)_{N\ge 1}$ be a collection of probability
measures on some separable metric space $E$. Suppose that $P_N$ weakly
converges to $P$ as $N$ tends to infinity. Then there exists a probability space
$(\Omega, \mathcal{F}, \mathbb{P})$ and a collection of random variables $X$ and $(X_N)_{N\ge 1}$ defined on
it such that $(\mathbb{P} \circ X^{-1}, \mathbb{P} \circ X_N^{-1}) = (P, P_N)$ and $\lim_{N\to\infty} X_N = X$, $\mathbb{P}$-almost
surely.

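The following useful lemma is also known as the $\delta$-method.

Lemma 9.3.1 ($\delta$-method) Let $(U_0^N, \ldots, U_n^N)_{N\ge 1}$ be a sequence of $\mathbb{R}^{n+1}$-valued
random variables defined on some probability space and $(u_p)_{0\le p\le n}$
be a given point in $\mathbb{R}^{n+1}$. Suppose that

$$\sqrt{N}\, (U_0^N - u_0, \ldots, U_n^N - u_n) \qquad (9.7)$$

converges in law, as $N \to \infty$, to some random vector $(U_0, \ldots, U_n)$. Then,
for any function $F_n : \mathbb{R}^{n+1} \to \mathbb{R}$ differentiable at the point $(u_p)_{0\le p\le n}$, the
sequence

$$\sqrt{N}\, [F_n(U_0^N, \ldots, U_n^N) - F_n(u_0, \ldots, u_n)]$$

converges in law as $N \to \infty$ to the random variable $\sum_{p=0}^n \frac{\partial F_n}{\partial u_p}(u_0, \ldots, u_n)\, U_p$.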

Proof: 

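By Skorohod’s theorem, it is legitimate to suppose that all random
variables are defined on a common probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and (9.7) is
the ordinary convergence in $\mathbb{R}^{n+1}$ for each sample $\omega \in \Omega$. Since $F_n$ is
assumed to be differentiable and $(U_p^N(\omega) - u_p)_{0\le p\le n}$ goes to zero, we have
for each $\omega$

$$\sqrt{N}\, [F_n(U_0^N(\omega), \ldots, U_n^N(\omega)) - F_n(u_0, \ldots, u_n)] = \sum_{p=0}^n \left[ \frac{\partial F_n}{\partial u_p}(u_0, \ldots, u_n) + \varepsilon_p^N(\omega) \right] \sqrt{N}\, (U_p^N(\omega) - u_p)$$

with $\lim_{N\to\infty} \varepsilon_p^N(\omega) = 0$ for every $0 \le p \le n$. Under our assumptions, we have for each $\omega$

$$\lim_{N\to\infty} \sqrt{N}\, [F_n(U_0^N(\omega), \ldots, U_n^N(\omega)) - F_n(u_0, \ldots, u_n)] = \sum_{p=0}^n \frac{\partial F_n}{\partial u_p}(u_0, \ldots, u_n)\, U_p(\omega)$$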

This ends the proof of the lemma. ■ 
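As a small worked illustration of the $\delta$-method, the sketch below computes the variance of the limiting variable $\sum_p \partial_{u_p} F_n(u)\, U_p$ when the limit vector is assumed to be centered Gaussian with a known covariance matrix. That Gaussian assumption, the finite-difference gradient, and the function names are choices made for this example only; the lemma itself requires nothing more than convergence in law.

```python
def numerical_gradient(F, u, h=1e-6):
    """Central finite-difference approximation of the gradient of F at u."""
    grad = []
    for p in range(len(u)):
        up = list(u)
        dn = list(u)
        up[p] += h
        dn[p] -= h
        grad.append((F(up) - F(dn)) / (2.0 * h))
    return grad

def delta_method_variance(F, u, cov):
    """Variance of sum_p dF/du_p(u) U_p for a centered vector U with covariance cov."""
    g = numerical_gradient(F, u)
    n = len(u)
    return sum(g[p] * cov[p][q] * g[q] for p in range(n) for q in range(n))

def ratio(v):
    """F(v0, v1) = v0 / v1, the map behind normalized estimates gamma(f) / gamma(1)."""
    return v[0] / v[1]

# Hypothetical numbers: expansion point u = (2, 4), covariance of the limit vector.
limit_var = delta_method_variance(ratio, [2.0, 4.0], [[1.0, 0.5], [0.5, 2.0]])
```

Here the gradient of the ratio at $(2, 4)$ is $(1/4, -1/8)$, so the limiting variance is $1/16 - 1/32 + 1/32 = 1/16$.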







9.4 Particle Density Profiles 

This section is concerned with the fluctuations of the particle approximation
measures $\gamma_n^N$ and $\eta_n^N$. For a precise description of the Feynman-Kac
semigroups involved in this study, we refer the reader to Section 7.2 or
Section 2.7.



9.4.1 Unnormalized Measures

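We recall that the “unnormalized” approximation measures $\gamma_n^N$ are defined
for any $\varphi_n \in \mathcal{B}_b(E_n)$ by

$$\gamma_n^N(\varphi_n) = \gamma_n^N(1)\, \eta_n^N(\varphi_n) \quad\text{with}\quad \gamma_n^N(1) = \prod_{p=0}^{n-1} \eta_p^N(G_p)$$

By Proposition 7.4.1, the real-valued process defined by

$$\Gamma_{\cdot,n}^N(\varphi_n) : p \in \{0, \ldots, n\} \longrightarrow \Gamma_{p,n}^N(\varphi_n) = \gamma_p^N(Q_{p,n}\varphi_n) - \gamma_p(Q_{p,n}\varphi_n)$$

has the $\mathcal{F}^N$-martingale decomposition

$$\Gamma_{p,n}^N(\varphi_n) = \sum_{q=0}^p \gamma_q^N(1) \left[ \eta_q^N(Q_{q,n}\varphi_n) - \Phi_q(\eta_{q-1}^N)(Q_{q,n}\varphi_n) \right]$$

and its angle bracket is given by

$$\langle \Gamma_{\cdot,n}^N(\varphi_n) \rangle_p = \frac{1}{N} \sum_{q=0}^p (\gamma_q^N(1))^2\ \eta_{q-1}^N\left( K_{q,\eta_{q-1}^N}[Q_{q,n}\varphi_n - K_{q,\eta_{q-1}^N} Q_{q,n}\varphi_n]^2 \right)$$

with the convention for $q = 0$, $\eta_{-1}^N K_{0,\eta_{-1}^N} = \Phi_0(\eta_{-1}^N) = \eta_0$. Let $\overline{\Gamma}_{p,n}^N(\varphi_n)$,
$0 \le p \le n$, be the random sequence defined as in (7.10) by replacing in
the summation the terms $\gamma_q^N(1)$ by their limiting values $\gamma_q(1)$. In order to
combine the CLT stated in Corollary 9.3.1 with the $\delta$-method stated in
Lemma 9.3.1, we rewrite the resulting random sequence as

$$\sqrt{N}\, \overline{\Gamma}_{n,n}^N(\varphi_n) = \sum_{q=0}^n \gamma_q(1)\, V_q^N(Q_{q,n}\varphi_n) = \sqrt{N}\, F_n(U_{0,n}^N, \ldots, U_{n,n}^N)$$

with the random sequence $U_{p,n}^N$, $0 \le p \le n$, and the function $F_n$ given by

$$U_{p,n}^N = V_p^N(Q_{p,n}\varphi_n)/\sqrt{N} \quad\text{and}\quad F_n(v_0, \ldots, v_n) \stackrel{\text{def}}{=} \sum_{q=0}^n \gamma_q(1)\, v_q$$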







Since for any $n \ge 0$ we have $\lim_{N\to\infty} \gamma_n^N(1) = \gamma_n(1)$ in probability, we
easily deduce from Corollary 9.3.1, Slutsky’s theorem (Theorem 9.3.2), and the
$\delta$-method (Lemma 9.3.1) that the real-valued random variable
$\sqrt{N}\, (\gamma_n^N(\varphi_n) - \gamma_n(\varphi_n))$ converges in law to
the Gaussian random variable $\sum_{q=0}^n \gamma_q(1)\, V_q(Q_{q,n}\varphi_n)$. The extension of
this result to multidimensional sequences is proved as before using Slutsky’s
theorem or the same techniques as used in the proof of Theorem 9.3.1.

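Proposition 9.4.1 For any $n \ge 0$, $d \ge 1$, and $f = (f^u)_{u=1,\ldots,d} \in \mathcal{B}_b(E_n)^d$, the sequence
of $d$-dimensional and $\mathcal{F}^N$-martingales $\sqrt{N}\, \Gamma_{p,n}^N(f)$, $0 \le p \le n$, converges
in law to an $\mathbb{R}^d$-valued and Gaussian martingale $\Gamma_{p,n}(f)$, $0 \le p \le n$, such
that for any $1 \le u, v \le d$, and $0 \le p \le n$

$$\langle \Gamma_{\cdot,n}(f^u), \Gamma_{\cdot,n}(f^v) \rangle_p = \sum_{q=0}^p (\gamma_q(1))^2\ \eta_{q-1}\left( K_{q,\eta_{q-1}}\left( [Q_{q,n}f^u - K_{q,\eta_{q-1}} Q_{q,n}f^u]\, [Q_{q,n}f^v - K_{q,\eta_{q-1}} Q_{q,n}f^v] \right) \right)$$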

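One immediate consequence of this multidimensional CLT is that the sequence
of random fields

$$W_n^{\gamma,N}(\varphi_n) \stackrel{\text{def}}{=} \sqrt{N}\, (\gamma_n^N(\varphi_n) - \gamma_n(\varphi_n))$$

converges in law as $N \to \infty$, in the sense of convergence of finite-dimensional
distributions, to a centered Gaussian field $W_n^\gamma$ satisfying for any $f^1, f^2 \in \mathcal{B}_b(E_n)$

$$E(W_n^\gamma(f^1)\, W_n^\gamma(f^2)) = \langle \Gamma_{\cdot,n}(f^1), \Gamma_{\cdot,n}(f^2) \rangle_n$$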

Remark 9.4.1 The martingale approach to CLTs we have developed can be
extended to Feynman-Kac models with general nonnegative potential functions
$G_n$ as soon as the normalizing constants $\gamma_n(1)$ are strictly positive.
More precisely, using Theorem 7.4.1, we can easily prove that the CLT
stated in Proposition 9.4.1 holds true for the martingale $\sqrt{N}\, \Gamma_{\cdot,n}^N(f)$ introduced
in (7.10) in Proposition 7.4.1.

9.4.2 Normalized Measures

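Observing that for any $n \in \mathbb{N}$ and $f \in \mathcal{B}_b(E_n)$ we have

$$W_n^{\eta,N}(f) = \frac{\gamma_n(1)}{\gamma_n^N(1)}\ W_n^{\gamma,N}\left( \frac{1}{\gamma_n(1)}\, (f - \eta_n(f)) \right)$$

and $\lim_{N\to\infty} \gamma_n(1)/\gamma_n^N(1) = 1$ in probability, we easily prove, using Slutsky’s
theorem (Theorem 9.3.2), that the sequence of real-valued random variables

$$W_n^{\eta,N}(f) = \sqrt{N}\, (\eta_n^N(f) - \eta_n(f))$$

converges to the Gaussian random variable $W_n^\eta(f)$ given by

$$W_n^\eta(f) = W_n^\gamma\left( \frac{1}{\gamma_n(1)}\, (f - \eta_n(f)) \right)$$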







To extend this CLT to multidimensional sequences, we follow the same
arguments as those used in the proof of Proposition 9.4.1 by replacing $Q_{p,n}$
by the semigroup $\overline{Q}_{p,n}$ defined by

$$\overline{Q}_{p,n}(\varphi_n) = \frac{Q_{p,n}(\varphi_n)}{\eta_p Q_{p,n}(1)} = \frac{\gamma_p(1)}{\gamma_n(1)}\ Q_{p,n}(\varphi_n)$$



To clarify the presentation, it is convenient to introduce some additional
notation. For any $0 \le p \le n$ and $f = (f^1, \ldots, f^d) \in \mathcal{B}_b(E_n)^d$, we write

$$f_{p,n} = \overline{Q}_{p,n}(f - \eta_n f) \qquad (9.8)$$



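Proposition 9.4.2 For any $n \ge 0$ and $f = (f^u)_{u=1,\ldots,d} \in \mathcal{B}_b(E_n)^d$, the
$\mathbb{R}^d$-valued process $W_p^{\eta,N}(f_{p,n})$, $0 \le p \le n$, given by

$$W_p^{\eta,N}(f_{p,n}) = \sqrt{N}\, \eta_p^N(f_{p,n})$$

converges in law to an $\mathbb{R}^d$-valued Gaussian martingale $W_p^\eta(f_{p,n})$, $0 \le p \le n$,
such that for any $1 \le u, v \le d$, and $0 \le p \le n$

$$\langle W^\eta(f_{\cdot,n}^u), W^\eta(f_{\cdot,n}^v) \rangle_p = \sum_{q=0}^p \eta_{q-1}\left[ K_{q,\eta_{q-1}}\left( (f_{q,n}^u - K_{q,\eta_{q-1}} f_{q,n}^u)(f_{q,n}^v - K_{q,\eta_{q-1}} f_{q,n}^v) \right) \right]$$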






Proof: 

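For any $\varphi = (\varphi^u)_{u=1,\ldots,d} \in \mathcal{B}_b(E_n)^d$, we have the decomposition

$$\eta_p^N(\overline{Q}_{p,n}\varphi) = \eta_0^N(\overline{Q}_{0,n}\varphi) + \sum_{q=1}^p \left[ \eta_q^N(\overline{Q}_{q,n}\varphi) - \eta_{q-1}^N(\overline{Q}_{q-1,n}\varphi) \right]$$

If we choose $\varphi = (f - \eta_n f)$ with $f = (f^1, \ldots, f^d) \in \mathcal{B}_b(E_n)^d$, this yields

$$\eta_p^N(f_{p,n}) = B_p^N(f_{\cdot,n}) + M_p^N(f_{\cdot,n}) \qquad (9.9)$$

where

$$B_p^N(f_{\cdot,n}) = \sum_{q=1}^p \left[ \Phi_q(\eta_{q-1}^N)(f_{q,n}) - \eta_{q-1}^N(f_{q-1,n}) \right], \qquad M_p^N(f_{\cdot,n}) = \sum_{q=0}^p \left[ \eta_q^N(f_{q,n}) - \Phi_q(\eta_{q-1}^N)(f_{q,n}) \right] \qquad (9.10)$$

with the usual convention $\Phi_0(\eta_{-1}^N) = \eta_0$. Since for any $0 \le q \le p$

$$\eta_q(f_{q,n}) = \eta_n(f - \eta_n f) = 0 \quad\text{and}\quad \eta_{q-1}(\overline{Q}_{q-1,q}1) = \eta_q(1) = 1 \qquad (9.11)$$

then (9.10) can also be written in the form

$$B_p^N(f_{\cdot,n}) = \sum_{q=1}^p \left[ \eta_{q-1}(\overline{Q}_{q-1,q}1) - \eta_{q-1}^N(\overline{Q}_{q-1,q}1) \right] \left[ \Phi_q(\eta_{q-1}^N)(f_{q,n}) - \Phi_q(\eta_{q-1})(f_{q,n}) \right]$$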

Using the error estimates presented in previous sections, one gets after some
tedious but easy calculations

$$N\, E\left( \sup_{0\le p\le n} \|B_p^N(f_{\cdot,n})\| \right) \le b(n) \qquad (9.12)$$

for some finite constant $b(n) < \infty$, which only depends on the parameter
$n$ (we recall that $\|f\| = \sum_{i=1}^d \|f^i\|$ for any $f = (f^1, \ldots, f^d)$). By Theorem 9.3.1,
we know that the $\mathcal{F}^N$-martingale $\sqrt{N}\, M_p^N(f_{\cdot,n})$ converges in
law to the desired Gaussian martingale $W_p^\eta(f_{\cdot,n})$. Arguing as in the proof
of Theorem 9.3.1 and using (9.12), we easily complete the proof of the
proposition. ■



9.4.3 Killing Interpretations and Related Comparisons

One of the best ways to interpret the fluctuation variances developed in
Section 9.4.1 and Section 9.4.2 is to use the Feynman-Kac killing interpretations
provided in Section 2.5.1. In this context $X_n$ is regarded as a
Markov particle evolving in an absorbing medium with obstacles related to
$[0,1]$-valued potentials. Using the same notation and terminology as was
used in Section 2.5.1, the Feynman-Kac semigroups $(Q_{p,n}, P_{p,n})$ have the
following interpretation

$$Q_{p,n}(x_p, dx_n) = (Q_{p+1} \cdots Q_n)(x_p, dx_n) = \mathbb{P}^c_{p,x_p}(X_n \in dx_n,\ T \ge n)$$

and, for any $x_p \in \widehat{E}_p = G_p^{-1}((0,1])$

$$P_{p,n}(x_p, dx_n) = \mathbb{P}^c_{p,x_p}(X_n \in dx_n \mid T \ge n)$$

where $\mathbb{P}^c_{p,x_p}$ represents the distribution of the absorbed particle evolution
model starting at $X_p = x_p$ at time $p$. In this context, the variance of
the fluctuation variable $W_n^\gamma(1)$ associated with the McKean interpretation
model $K_{n,\eta}(x_{n-1}, \cdot) = \Phi_n(\eta)$ is given by

$$E(W_n^\gamma(1)^2) = \gamma_n(1)^2 \sum_{p=0}^n \eta_p\left( [1 - Q_{p,n}(1)/\eta_p Q_{p,n}(1)]^2 \right)$$

$$= \mathbb{P}^c(T \ge n)^2 \sum_{p=0}^n \int_{E_p} \mathbb{P}^c(X_p \in dx_p \mid T \ge p) \left[ 1 - \frac{\mathbb{P}^c_{p,x_p}(T \ge n)}{\mathbb{P}^c(T \ge n \mid T \ge p)} \right]^2$$

We further assume that for every $n \ge p$ and $\eta_p$-a.e. $x_p, y_p \in E_p$, we have

$$\mathbb{P}^c_{p,x_p}(T \ge n) \ge \delta\ \mathbb{P}^c_{p,y_p}(T \ge n)$$

for some $\delta > 0$. By Proposition 4.3.3, this condition is met as soon as the
conditions (G) and $(M)_m$ introduced on page 115 hold true for some integer
$m \ge 1$ and some pair parameters $(\varepsilon_n(G), \varepsilon_n(M))$ such that $\varepsilon(G) = \wedge_n \varepsilon_n(G)$
and $\varepsilon(M) = \wedge_n \varepsilon_n(M) > 0$. In this case, we have

$$E(W_n^\gamma(1)^2) \le b(\delta)\, (n+1)\ \mathbb{P}^c(T \ge n)^2$$

for some finite constant $b(\delta) < \infty$. In general, the nonabsorption probabilities
$\gamma_n(1) = \mathbb{P}^c(T \ge n)$ are very small and rather difficult to estimate
(see also Section 12.2.5). The killing interpretation also suggests another
approximation model based on $N$ iid copies $X^i$ of the absorbed particle
evolution model.



[Figure: killing and exploration transitions of the absorbed particle model]






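The Monte Carlo approximation is now given by $\frac{1}{N} \sum_{i=1}^N 1_{T^i \ge n}$, where $T^i$
represents the absorption time of the $i$th particle. It is well-known that the
fluctuation variance of this scheme is given by

$$\sigma_n^{MC}(1)^2 = \mathbb{P}^c(T \ge n)\, (1 - \mathbb{P}^c(T \ge n))$$

From previous considerations we find that

$$\frac{\sigma_n^{MC}(1)^2}{E(W_n^\gamma(1)^2)} \ge \frac{1}{b(\delta)(n+1)}\ \frac{1 - \mathbb{P}^c(T \ge n)}{\mathbb{P}^c(T \ge n)}$$

and the right-hand side tends to infinity as $n \to \infty$ as soon as $\mathbb{P}^c(T \ge n) = o(1/n)$.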
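The comparison between the two fluctuation variances can be made concrete on a finite state space. The sketch below is a hedged illustration: it assumes a time-homogeneous model (one potential vector `G` with values in $(0,1]$ and one Markov matrix `M`), uses the formula for $E(W_n^\gamma(1)^2)$ stated above for the McKean model $K_{n,\eta} = \Phi_n(\eta)$, and all function names are hypothetical.

```python
def integrate(eta, f):
    return sum(e * v for e, v in zip(eta, f))

def mat_vec(M, f):
    return [sum(m * v for m, v in zip(row, f)) for row in M]

def flow(eta0, G, M, n):
    """[eta_0, ..., eta_n] with eta_{p+1} = eta_p(G M) / eta_p(G)."""
    etas = [list(eta0)]
    for _ in range(n):
        w = [e * g for e, g in zip(etas[-1], G)]
        z = sum(w)
        size = len(w)
        etas.append([sum(w[x] * M[x][y] for x in range(size)) / z
                     for y in range(size)])
    return etas

def q_one(G, M, n):
    """[Q_{0,n}(1), ..., Q_{n,n}(1)] via the backward recursion Q_{p,n}(1) = G * M Q_{p+1,n}(1)."""
    hs = [[1.0] * len(G)]
    for _ in range(n):
        hs.insert(0, [g * v for g, v in zip(G, mat_vec(M, hs[0]))])
    return hs

def survival_variances(eta0, G, M, n):
    """Return (gamma_n(1), particle variance E(W_n^gamma(1)^2), crude Monte Carlo variance)."""
    etas = flow(eta0, G, M, n)
    hs = q_one(G, M, n)
    gamma_n = 1.0
    for p in range(n):
        gamma_n *= integrate(etas[p], G)
    particle = 0.0
    for p in range(n + 1):
        norm = integrate(etas[p], hs[p])
        particle += integrate(etas[p], [(1.0 - v / norm) ** 2 for v in hs[p]])
    particle *= gamma_n ** 2
    crude = gamma_n * (1.0 - gamma_n)
    return gamma_n, particle, crude

# Hypothetical two-state example with a strongly absorbing first state.
M = [[0.5, 0.5], [0.25, 0.75]]
eta0 = [0.5, 0.5]
gamma, particle_var, crude_var = survival_variances(eta0, [0.25, 0.5], M, 8)
```

When the potential is constant the functions $Q_{p,n}(1)$ are constant, so the particle variance vanishes while the crude Monte Carlo variance does not; for nonconstant potentials the ratio `crude_var / particle_var` gives a feel for the gain discussed above.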

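If we choose the McKean transitions $K_{n,\eta}(x_{n-1}, \cdot) = \Phi_n(\eta)$, the variance
of the random field $W_n^\eta$ can also be described for any $f^1$ and $f^2 \in \mathcal{B}_b(E_n)$
as

$$E(W_n^\eta(f^1)\, W_n^\eta(f^2)) = \sum_{p=0}^n \eta_p(f_{p,n}^1\, f_{p,n}^2) = \sum_{p=0}^n \eta_p\left( [\overline{Q}_{p,n}(f^1 - \eta_n f^1)]\, [\overline{Q}_{p,n}(f^2 - \eta_n f^2)] \right) \qquad (9.13)$$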







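This covariance function can be formulated using the potential functions
$G_{p,n}$ and the Markov transitions $P_{p,n}$. More precisely, since for any $f \in \mathcal{B}_b(E_n)$
we have

$$\overline{Q}_{p,n}(f - \eta_n f) = \overline{Q}_{p,n}(1) \left( \frac{Q_{p,n}(f)}{Q_{p,n}(1)} - \eta_n f \right) = \frac{G_{p,n}}{\eta_p(G_{p,n})}\ (P_{p,n} f - \eta_n f)$$

we conclude that

$$E(W_n^\eta(f^1)\, W_n^\eta(f^2)) = \sum_{p=0}^n \int \left( \frac{G_{p,n}}{\eta_p(G_{p,n})} \right)^2 (P_{p,n} f^1 - \eta_n f^1)\, (P_{p,n} f^2 - \eta_n f^2)\ d\eta_p \qquad (9.14)$$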



If the Markov kernels $M_n$ are trivial in the sense that $M_n(x_{n-1}, \cdot) = \mu_n$
for some $\mu_n \in \mathcal{P}(E_n)$, then one can readily check that $\eta_n = \mu_n$ and
$P_{p,n}(x_p, dx_n) = \mu_n(dx_n)$. In this particular situation, $W_n^\eta$ is the classical
$\mu_n$-Brownian bridge. Namely, $W_n^\eta$ is the centered Gaussian process with
covariance

$$E(W_n^\eta(f^1)\, W_n^\eta(f^2)) = \mu_n\left( (f^1 - \mu_n f^1)(f^2 - \mu_n f^2) \right)$$

If we choose the McKean transitions

$$K_{n,\eta}(x, dy) = G_{n-1}(x)\, M_n(x, dy) + (1 - G_{n-1}(x))\, \Phi_n(\eta)(dy)$$

then, using the fact that $\eta_q(f_{q,n}) = 0$, we first observe that

$$K_{q,\eta_{q-1}}(f_{q,n}) = G_{q-1}\, M_q(f_{q,n})$$

In this situation, we conclude that the variance of the random field $W_n^\eta$ is
defined for any $f^1$ and $f^2 \in \mathcal{B}_b(E_n)$ by the formula

$$E(W_n^\eta(f^1)\, W_n^\eta(f^2)) = \sum_{p=0}^n \eta_p(f_{p,n}^1\, f_{p,n}^2) - \sum_{p=1}^n \eta_{p-1}\left( G_{p-1}^2\, M_p(f_{p,n}^1)\, M_p(f_{p,n}^2) \right)$$

If we take $f^1 = f^2$, it is readily seen that the variance of the corresponding
CLT is strictly smaller than the one associated with the McKean interpretation
$K_{n,\eta}(x_{n-1}, \cdot) = \Phi_n(\eta)$.
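The two variance formulas for the normalized fluctuations can be compared directly on a finite state space. The sketch below assumes a time-homogeneous model with a potential vector `G` taking values in $(0,1]$ and a Markov matrix `M`; it computes the functions $f_{p,n}$ by a backward recursion and then evaluates both sums. The helper names are hypothetical and only serve this illustration.

```python
def integrate(eta, f):
    return sum(e * v for e, v in zip(eta, f))

def mat_vec(M, f):
    return [sum(m * v for m, v in zip(row, f)) for row in M]

def flow(eta0, G, M, n):
    """[eta_0, ..., eta_n] with eta_{p+1} = eta_p(G M) / eta_p(G)."""
    etas = [list(eta0)]
    for _ in range(n):
        w = [e * g for e, g in zip(etas[-1], G)]
        z = sum(w)
        size = len(w)
        etas.append([sum(w[x] * M[x][y] for x in range(size)) / z
                     for y in range(size)])
    return etas

def centered_functions(eta0, G, M, f, n):
    """Return (etas, [f_{0,n}, ..., f_{n,n}]) with f_{p,n} = Qbar_{p,n}(f - eta_n f)."""
    etas = flow(eta0, G, M, n)
    mean = integrate(etas[n], f)
    g = [v - mean for v in f]          # Q_{n,n}(f - eta_n f)
    h = [1.0] * len(f)                 # Q_{n,n}(1)
    fs = [g]
    for p in range(n - 1, -1, -1):
        g = [w * v for w, v in zip(G, mat_vec(M, g))]
        h = [w * v for w, v in zip(G, mat_vec(M, h))]
        norm = integrate(etas[p], h)   # eta_p Q_{p,n}(1)
        fs.insert(0, [v / norm for v in g])
    return etas, fs

def clt_variances(eta0, G, M, f, n):
    """Variances of W_n^eta(f) for K = Phi_n(eta) and for K = G M + (1 - G) Phi_n(eta)."""
    etas, fs = centered_functions(eta0, G, M, f, n)
    first = sum(integrate(etas[p], [v * v for v in fs[p]]) for p in range(n + 1))
    correction = sum(
        integrate(etas[p - 1],
                  [(g * mv) ** 2 for g, mv in zip(G, mat_vec(M, fs[p]))])
        for p in range(1, n + 1))
    return first, first - correction

# Hypothetical two-state example.
M = [[0.5, 0.5], [0.25, 0.75]]
G = [0.5, 1.0]
eta0 = [0.5, 0.5]
f = [0.0, 1.0]
var_selection, var_mixed = clt_variances(eta0, G, M, f, 3)
```

The second returned value is never larger than the first, in line with the comparison stated above; with a trivial kernel whose rows all equal a measure $\mu$, the first value reduces to the variance of `f` under $\mu$, the Brownian bridge case.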

Since we have

$$f_{p,n}(x_p) = \frac{G_{p,n}(x_p)}{\eta_p(G_{p,n})} \int_{E_p} \left( P_{p,n}(f)(x_p) - P_{p,n}(f)(y_p) \right) \frac{G_{p,n}(y_p)}{\eta_p(G_{p,n})}\ \eta_p(dy_p)$$



we readily observe that $\|f_{p,n}\| \le r_{p,n}\, \beta(P_{p,n})$ with the pair of parameters
$(r_{p,n}, \beta(P_{p,n}))$ introduced on page 218. Arguing as in the proof of Theorem 7.4.4
we obtain the uniform estimate

$$\sup_{n\ge 0} E\left( W_n^\eta(f)^2 \right) \le m / \left( \varepsilon^3(M)\ \varepsilon(G)^{2m-1} \right)$$

as soon as conditions (G) and $(M)_m$ hold true for some integer $m \ge 1$
and some pair parameters $(\varepsilon_n(G), \varepsilon_n(M))$ such that $\varepsilon(G) = \wedge_n \varepsilon_n(G)$ and
$\varepsilon(M) = \wedge_n \varepsilon_n(M) > 0$.



9.5 A Berry-Esseen Type Theorem 



In earlier sections, we have established various CLTs for the particle approximation
measures $\gamma_n^N$ and $\eta_n^N$. These results lead inevitably to the
question of the rate of convergence of these fluctuations. In the present 
section, we analyze the CLT and the rate of convergence of the correspond- 
ing fluctuations for martingale sequences. The following theorem, which we 
give without proof, is due to Berry and Esseen (for a detailed proof, we 
refer the reader to any textbook on applied probability, such as [55]). It 
provides a simple way to estimate how close two distribution functions are 
in terms of their characteristic functions. 

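Theorem 9.5.1 (Berry-Esseen) Let $(F_1, F_2)$ be a pair of distribution
functions with characteristic functions $(f_1, f_2)$. Also assume that $F_2$ has
a derivative with $\left\| \frac{\partial F_2}{\partial x} \right\| < \infty$. Then, for any $a > 0$, we have

$$\|F_1 - F_2\| \le \frac{2}{\pi} \int_0^a \frac{|f_1(x) - f_2(x)|}{x}\ dx + \frac{24}{a\pi} \left\| \frac{\partial F_2}{\partial x} \right\|$$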



In a CLT, we have a given sequence of distribution functions $F_1^N$ that converge
as $N \to \infty$ to a centered Gaussian distribution function $F_2$ with a
prescribed variance, say $\sigma^2$. As an aside, we note that in this case we have
that $\left\| \frac{\partial F_2}{\partial x} \right\| = 1/\sqrt{2\pi\sigma^2}$. Theorem 9.5.1 reduces the study of the rapidity of
convergence in the CLT to the estimation of an average difference between
the corresponding characteristic functions. In the case of iid random variables
$(\zeta^i)_{i\ge 1}$ with mean zero, unit variance, and finite third moment, the
Berry-Esseen inequality implies that

$$\sup_{x\in\mathbb{R}} \left| P\left( \frac{1}{\sqrt{N}} \sum_{i=1}^N \zeta^i \le x \right) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^x e^{-y^2/2}\, dy \right| \le c\ \frac{E(|\zeta^1|^3)}{\sqrt{N}}$$





In the display above and in the further development of this section, the
letter $c$ represents any finite constant whose values may vary from line to
line. The method for obtaining the estimate above is based on Taylor’s
expansions of characteristic functions. Several extensions to not necessarily
identically distributed random sequences, such as martingale sequences
$\zeta^i$, can be found in the literature; see for instance [55], p. 228 and [54],
Section 9.3, p. 318. The extension to interacting particle models is a little
more involved. Next we present a rather general strategy based on martingale
techniques and on the following technical lemma, whose proof is
based on Stein’s approach for CLTs, which is not discussed here; thus its
proof will be omitted (see for instance Lemma 1.3 in the textbook by G.R.
Shorack [288]).
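The $1/\sqrt{N}$ rate in the classical iid Berry-Esseen estimate can be observed exactly for sums of symmetric random signs, whose distribution is binomial. The sketch below is an illustration chosen for this purpose and is not taken from the text: it computes the Kolmogorov distance to the standard Gaussian law at the atoms of the normalized sum (where the supremum of a step function against a continuous increasing function is attained) and rescales it by $\sqrt{N}$.

```python
import math

def normal_cdf(x):
    """Distribution function of the standard Gaussian law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def kolmogorov_distance_signs(n):
    """sup_x |P(S_n / sqrt(n) <= x) - Phi(x)| for S_n a sum of n iid +1/-1 signs."""
    dist = 0.0
    cdf = 0.0
    for k in range(n + 1):                      # k = number of +1 signs
        x = (2 * k - n) / math.sqrt(n)          # atom of S_n / sqrt(n)
        target = normal_cdf(x)
        dist = max(dist, abs(cdf - target))     # left limit at the atom
        cdf += math.comb(n, k) / 2 ** n
        dist = max(dist, abs(cdf - target))     # value at the atom
    return dist

# Rescaled distances sqrt(N) * ||F^N - F|| for a few sample sizes.
scaled = {n: math.sqrt(n) * kolmogorov_distance_signs(n) for n in (1, 4, 25, 100, 400)}
```

The rescaled distances stay bounded as $N$ grows, which is the behavior a bound of the form $c/\sqrt{N}$ predicts; the numerical values themselves are specific to this example.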

Lemma 9.5.1 Let $F_Z$ be the distribution function associated with a real-valued
random variable $Z$, and let $W$ be a centered Gaussian random variable
with unit variance. For any pair of random variables $(X,Y)$, we have

$$\|F_{X+Y} - F_W\| \le \|F_X - F_W\| + 4E(|XY|) + 4E(|Y|) \qquad (9.15)$$

Loosely speaking, the lemma above is well-suited to deduce Berry-Esseen
estimates for stochastic sequences of the form $X^N + Y^N$ when the corresponding
statement has been proved for $X^N$ and the random “perturbation”
$Y^N$ tends to zero “sufficiently fast”.

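We let $(\Omega^N, (\mathcal{F}_n^N)_{n\ge 0}, \mathbb{P}^N)$, $N \ge 1$, be a collection of probability
spaces with a distinguished nondecreasing collection of $\sigma$-algebras $\mathcal{F}_n^N \subset \mathcal{F}_{n+1}^N$,
$n \ge 0$. For a given sequence of $\mathcal{F}^N$-adapted random variables $X_n^N$,
we write for each $n \ge 0$

$$\Delta X_n^N = X_n^N - X_{n-1}^N$$

with the convention $\Delta X_0^N = X_0^N$ for $n = 0$. We recall that a square integrable
and $\mathcal{F}^N$-martingale $M^N = (M_n^N)_{n\ge 0}$ is an $\mathcal{F}^N$-adapted sequence
such that $E((M_n^N)^2) < \infty$ for all $n \ge 0$ and

$$E(M_{n+1}^N \mid \mathcal{F}_n^N) = M_n^N \quad (\mathbb{P}^N\text{-a.s.})$$

The sequence of random variables $(\Delta M_n^N)_{n\ge 0}$ is also called an $\mathcal{F}^N$-martingale
difference and the predictable quadratic characteristic of $M^N$ is the sequence
of random variables $\langle M^N \rangle = (\langle M^N \rangle_n)_{n\ge 0}$ defined by

$$\langle M^N \rangle_n = \sum_{p=0}^n E((\Delta M_p^N)^2 \mid \mathcal{F}_{p-1}^N)$$

with the convention $E((\Delta M_0^N)^2 \mid \mathcal{F}_{-1}^N) = E((M_0^N)^2)$ for $p = 0$. The
stochastic sequence $\langle M^N \rangle$ is also called the angle bracket of $M^N$
and is the unique predictable increasing process such that the sequence
$((M_n^N)^2 - \langle M^N \rangle_n)_{n\ge 0}$ is an $\mathcal{F}^N$-martingale. For each $n \ge 0$ and $N \ge 1$,
we write

$$C_n^N = N\, \langle M^N \rangle_n$$

the angle bracket of the $\mathcal{F}^N$-martingale sequence $(\sqrt{N}\, M_n^N)_{n\ge 0}$. The typical
situation we have in mind is the one-dimensional $\mathcal{F}^N$-martingale

$$M_n^N(f) = \sum_{p=0}^n \left[ \eta_p^N(f_p) - \Phi_p(\eta_{p-1}^N)(f_p) \right] \qquad (9.16)$$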







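introduced in (9.6), where $f_p : x_p \in E_p \mapsto f_p(x_p) \in \mathbb{R}$, $p \ge 0$, stands
for some collection of measurable and bounded functions. No generality is
lost and much convenience is gained by supposing (as will be done in this
section) that the test functions $f_p$ are such that $\|f_p\| \le 1$ for all $p \ge 0$.
Note that in this situation $\mathcal{F}_n^N = \sigma(\xi_0, \ldots, \xi_n)$ is the filtration generated
by the particle model and we have that

$$N\, \langle M^N(f) \rangle_n = C_n^N(f) = \sum_{p=0}^n \eta_{p-1}^N\left[ K_{p,\eta_{p-1}^N}\left( (f_p - K_{p,\eta_{p-1}^N} f_p)^2 \right) \right]$$

Also note that, in the first case McKean interpretation defined on page 219
we have

$$\Delta C_n^N(f) = \Phi_n(\eta_{n-1}^N)\left( [f_n - \Phi_n(\eta_{n-1}^N)(f_n)]^2 \right) \qquad (9.17)$$

while in the second case

$$\Delta C_n^N(f) = \Phi_n(\eta_{n-1}^N)(f_n^2) - \Phi_n(\eta_{n-1}^N)(f_n)^2 - \eta_{n-1}^N\left[ G_{n-1}^2 \left( M_n(f_n) - \Phi_n(\eta_{n-1}^N)(f_n) \right)^2 \right]$$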

Our approach to the Berry-Esseen theorem for $\mathcal{F}^N$-martingale sequences
is based on the following conditions:

• (ffl) For any n > 0, there exists some constants ai(r^< oo and 
0 < Ci(n) < 1 such that for any n > 0 and A® < Ci(n) y/N we have 

|E( j _ J| < ai{n)X^/y/N (P^-a.s) 

• {H2) For emy n > 0, there exists some finite constant 02 (n) < oo 
such that for any iV > 1, A > 0, and n > 0 

eA»a,(n)/^ 

• (H3) There exists a nonnegative and strictly increasing deterministic 
process C = (Cn)n>o as well as some finite constants 0 < 03 (n) < od 
such that for any e > 0 we have 

E(e'^l^^n -^<^’>1) < (1 +ea3(n)) 

We readily observe that |E(e*'^'^*^’^ )| < E(|E(e*'^'^^*^^ l>^,J^-i)l) from 
which we find that condition (H2) is met as soon as we have the following 
almost sure estimates 







Conditions (H1) and (H2) are rather classical. They are usually checked
using simple asymptotic expansions of characteristic functions. The regularity
condition (H3) is more tricky to check in practice. It can be regarded
as an exponential continuity condition on the increasing process $C^N$. The
next two lemmas illustrate these three regularity conditions and their consequences.
Their proofs are rather technical and are housed at the end of
the section.

Lemma 9.5.2 The $\mathcal{F}^N$-martingale $M_n^N(f)$ defined in (9.16) satisfies conditions
(Hj), $j = 1, 2, 3$, for some universal constants

$$(a_1(n), a_2(n)) = (a_1, a_2)$$

and with the nonnegative increasing process $C(f) = (C_n(f))_{n\ge 0}$ defined by

$$C_n(f) = \sum_{p=0}^n \eta_{p-1}\left[ K_{p,\eta_{p-1}}\left( (f_p - K_{p,\eta_{p-1}} f_p)^2 \right) \right]$$

as soon as the mapping $n \to C_n(f)$ is strictly increasing. In addition, the
constant $a_3(n)$ in (H3) can be chosen such that for any $n \ge 0$

$$0 < a_3(n) \le \sum_{p=0}^n r_{p,n}\, \beta(P_{p,n})$$

with the collection of parameters $(r_{p,n}, \beta(P_{p,n}))$ introduced in (7.3).

Lemma 9.5.3 Suppose we are given a sequence of $\mathcal{F}^N$-martingales $M^N = (M_n^N)_{n\ge 0}$
satisfying conditions (Hj) with $j = 1, 2, 3$. Then, for any $n \ge 0$,
there exists a finite constant $a(n) < \infty$, a positive constant $b(n)$, and some
$N(n) \ge 1$ such that for any $N \ge N(n)$ and $0 \le \lambda \le b(n)\sqrt{N}$

\E{e'^^^- ) _ Cn| < a(„) ^ 

Lemma 9.5.3 shows that whenever the regularity conditions (Hj), $j = 1, 2, 3$,
are met, for any fixed time parameter $n \ge 0$ the sequence of random
variables $\sqrt{N}\, M_n^N$ converges in law as $N \to \infty$ to a Gaussian random
variable $M_n$ with

$$E(M_n) = 0 \quad\text{and}\quad E(M_n^2) = C_n$$

In the context of particle models, we readily deduce from Lemma 9.5.2 that
the sequence of random variables $\sqrt{N}\, M_n^N(f)$ converges in law as $N \to \infty$
to a centered Gaussian random variable $M_n(f)$ with

$$E(M_n(f)^2) = \sum_{p=0}^n \eta_{p-1}\left[ K_{p,\eta_{p-1}}\left( (f_p - K_{p,\eta_{p-1}} f_p)^2 \right) \right]$$

as soon as $C_n(f) > 0$ for any $n \ge 0$. This result is clearly much weaker
than the multidimensional CLT presented in Theorem 9.3.1. Nevertheless,
by a simple application of Theorem 9.5.1, we find the following fluctuation
decays.

Theorem 9.5.2 Let $M^N = (M_n^N)_{n\ge 0}$ be a sequence of $\mathcal{F}^N$-martingales
satisfying conditions (Hj) with $j = 1, 2, 3$ for some nonnegative and increasing
process $C_n$. Also let $F_n^N$ be the distribution function of the random
variable $\sqrt{N}\, M_n^N$ and let $F_n$ be the distribution function of a centered Gaussian
random variable with variance $C_n$. Then, for any $n \ge 0$ there exists
some $N(n) \ge 1$ and some finite constant $c(n) < \infty$ such that for any
$N \ge N(n)$ we have

$$\|F_n^N - F_n\| \le c(n)/\sqrt{N}$$

Proof:

By Theorem 9.5.1 and Lemma 9.5.3, we have for any N > N{n) 



< 

for some N{n) > 1 and some finite positive constant 0 < b(n) < oo. This 
ends the proof of the theorem. ■ 



2a(n) 



^ f 
7T Jo 



b(n)VN 



e~^ AC„ + d\ 
24 



b{n)\/2en^C„ 

2a(n) f°° _Ai A/^ . 4 



TI" Jo 



b{n)y/U: 



In the context of particle models, we conclude that for each $n \ge 0$ the
distribution function $F_n^N$ of the normalized random variables $\sqrt{N}\, M_n^N(f)/\sqrt{C_n(f)}$
weakly converges to the distribution function $F_n$ of the centered and normalized
Gaussian random variable $M_n(f)/\sqrt{C_n(f)}$ and for any $N \ge N(n)$
and some $N(n) \ge 1$

$$\sqrt{N}\, \|F_n^N - F_n\| \le b(n) \qquad (9.18)$$

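One strategy to deduce a Berry-Esseen estimate for the fluctuations of
the particle density profiles is to use the semimartingale decomposition
presented in the proof of Proposition 9.4.2. More precisely, we fix a time
horizon $n \ge 0$ and we associate with the test function $f \in \mathcal{B}_b(E_n)$ the
sequence of functions $(f_{p,n})_{p\le n}$ defined by

$$f_{p,n} = \overline{Q}_{p,n}(f - \eta_n f) \in \mathcal{B}_b(E_p)$$

with the normalized Feynman-Kac semigroup $\overline{Q}_{p,n}$ given for any $\varphi_n \in \mathcal{B}_b(E_n)$
and $x_p \in E_p$ by the equations

$$\overline{Q}_{p,n}(\varphi_n) = \frac{Q_{p,n}(\varphi_n)}{\eta_p Q_{p,n}(1)} \quad\text{and}\quad Q_{p,n}(\varphi_n)(x_p) = E_{p,x_p}\left( \varphi_n(X_n) \prod_{q=p}^{n-1} G_q(X_q) \right)$$

Now we recall from (9.9) that

$$W_p^{\eta,N}(f_{p,n}) \stackrel{\text{def}}{=} \sqrt{N}\, [\eta_p^N(f_{p,n}) - \eta_p(f_{p,n})] = \sqrt{N}\, B_p^N(f_{\cdot,n}) + \sqrt{N}\, M_p^N(f_{\cdot,n})$$

with the $\mathcal{F}^N$-martingale sequence $M_p^N(f_{\cdot,n})$ defined by

$$M_p^N(f_{\cdot,n}) = \sum_{q=0}^p \left[ \eta_q^N(f_{q,n}) - \Phi_q(\eta_{q-1}^N)(f_{q,n}) \right]$$

and the $\mathcal{F}^N$-predictable sequence $B_p^N(f_{\cdot,n})$ defined for any $p \le n$ by the
formula

$$B_p^N(f_{\cdot,n}) = \sum_{q=1}^p \left[ \eta_{q-1}(\overline{Q}_{q-1,q}1) - \eta_{q-1}^N(\overline{Q}_{q-1,q}1) \right] \left[ \Phi_q(\eta_{q-1}^N)(f_{q,n}) - \Phi_q(\eta_{q-1})(f_{q,n}) \right]$$

Note that for $p = n$ we have $f_{n,n} = (f - \eta_n f)$ and

$$W_n^{\eta,N}(f_{n,n}) = \sqrt{N}\, [\eta_n^N(f) - \eta_n(f)]$$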

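Theorem 9.5.3 For any $n \ge 0$, there exists some $N(n) \ge 1$ and some
finite constant $b(n) < \infty$ such that for any $N \ge N(n)$

$$\sup_{u\in\mathbb{R}} \left| P\left( W_n^{\eta,N}(f_{n,n}) \le u \sqrt{C_n(f_{\cdot,n})} \right) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^u e^{-v^2/2}\, dv \right| \le \frac{b(n)}{\sqrt{N}}$$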

Proof: 

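By arguments that should be now familiar to the reader, we find that

$$N\,\mathbb{E}(|B_n^N(f_{\cdot,n})|^2)^{1/2} \le b(n) \quad\text{and}\quad N\,\mathbb{E}(|B_n^N(f_{\cdot,n})|) \le b(n)$$

By the definition of the martingale term, we also easily check that

$$\sqrt{N}\,\mathbb{E}(|M_n^N(f_{\cdot,n})|^2)^{1/2} \le b(n)$$

To apply Lemma 9.5.1, we set

$$X = \sqrt{N}\,M_n^N(f_{\cdot,n})/\sqrt{C_n(f)} \quad\text{and}\quad Y = \sqrt{N}\,B_n^N(f_{\cdot,n})/\sqrt{C_n(f)}$$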







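From previous estimates, we deduce that

$$\mathbb{E}(|XY|) \le \frac{N}{C_n(f)}\,\mathbb{E}(|M_n^N(f_{\cdot,n})|^2)^{1/2}\,\mathbb{E}(|B_n^N(f_{\cdot,n})|^2)^{1/2} \le b(n)/\sqrt{N}$$

and $\mathbb{E}(|Y|) \le b(n)/\sqrt{N}$. The end of the proof is now a simple consequence of (9.15) and (9.18). ■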

We now come to the proofs of Lemma 9.5.3 and Lemma 9.5.2. 

Proof of Lemma 9.5.3: 

Let be the function defined for any A > 0 by 



/^'(A) = E(e‘"'^*'"+^^’*)-l 



We have the easily verified recursive equations 

/^"(A)-ei(A) 

x[E(e<^v/3VAM„^+A^AC„^ _ i]|g^(AC„-AC^')j) 

c„-,[g^(Ac„-AC^') _ ij) 



Using this, we obtain 

W)-ei(A) 

< [ E(|E(e‘^^^*^n ^ g^(AC„-AC~)) 

+E(e^ |AC„-ACj:'| _ 1) J 







Under conditions (H1) and (H3), we find that






+ (l + e^“3(«) _ ij 



+ (e^“3(”) - 1) + e^“3(")j 

for any 0 < < ci(n) \/]V. Since for these pairs of parameters (A, N) we 

have A^ < \/N (and therefore X* < N), we find that 

|/n"^(A) - ei(A)| < d{n) X^{1 + X)/y/N 

for some finite constant $d(n) < \infty$ whose values only depend on $a_i(n)$, $i = 1, 3$, and such that

d{n) = ce * (1 V Oi(n) V fl 3 (n))^ 

If we set 

c*(n) = A^^QCiip) (< 1) and d*(n) = 
then for any 0 < p < n and any 0 < A* < c*(n) y/N we have that 

\Ip{X) - I^-x(X)\ < d*(n) A^Cl + X)/Vn 

It is now easily verified from these estimates that

\InW\ < (n + l)d*(n)e'^ ^’•-‘A^Cl + X)/y/N 
from which we conclude that for any 0 < A^ < c*(n) VTV 

) - e-^ ^"1 < (n+ l)(f (n)-^(l + A) e~^ (9.19) 

y/N 

On the other hand, we have for any pair (A, N) 

|E(c‘^v^*^" ) _ c„ I < ^1 C„ (9 20) 

and under condition {H2) 

|E(gtAViVW~)| < gA»aa(n)/N/N 







Again using (H3), we also find that






) A^ao(n) 






Observe that for any pair (A, N) such that 



A < c*(n) y/N with c*(n) = [203 ^(n) A (2 ^ ACn(l -f 2 a 2 (n)) ^)] 



we have 

^[2ai(n) + a§(n)j^] < A(2<.,(„) + 1) < :^ 

This yields that 

|E(e«^«.“)| < [ 1 V (1 + e-^’^ (9.21) 

and hence, by (9.20) and for any A < c*(n)y/N, we find that 

To take the final step, we observe that for any 

N > Ci,{n)/c*{n)^ and < A < c*(n) VTv 

we have 1 = Ct{n)/Ci,{n) < c~^{n)X^/y/N and by (9.22) 

c„| < c;i(n)[ 2 Va 3 (n)] -^(1 + A) (9.23) 

y/N 

In conjunction with (9.19), we conclude that for any 
N > N{n) = c*(n)/c*(n)3 

and any $\lambda < c^*(n)\sqrt{N}$

) - e"^ ‘^"l < a(n) -^(1 + A) e"^ 
y/N 



JXy/NMjfy 



< e-^ '^C'"[ 2 Vo 3 (n)l fl + 



with a(n) = [(n+ l)d*(n)] V [c* ^(n)( 2 V 03(71))]. This ends the proof of the 
lemma. ■ 







Proof of Lemma 9.5.2: 

We first check that the regularity condition (H3) is satisfied. For the McKean interpretation model $K_{n,\eta}(x,\cdot) = \Phi_n(\eta)$, we have

ACnif) = 

= - ^niVn-l){fn? = Vn{f^) ~ Vnifn? 

and by (9.17) we easily prove that 

+ 2 |*n(-l„"-,)(/n)-*n(>?n-l)(/„)| 

By symmetry arguments, we have 

Applying Jensen's inequality, we find that for any $\varepsilon > 0$

E(giN/iv|AC~(/)-AC„(/)|) 



< I ^n-,)+2E(W^ (/„)-„„(/„)! I 

Now applying the Cauchy-Schwarz inequality, we obtain
£(g£vTv|AC;:'(/)-AC„(/)|) 

Using Corollary 7.4.3 (and recalling that $f_n^2/4$ and $f_n/2 \in \mathrm{Osc}_1(E_n)$ for any $\|f_n\| \le 1$), we conclude that

E(e^l^^"(^)-^^"(^)l)< (l-Hea3(n)/N/i7) 

for some finite constant 03(71) such that 03(71) < /^(^9,n)- 

Using the same line of argument we prove that (H3) also holds true for any McKean interpretation of the form (9.1). To prove that (H2) is met, we first recall that



I 



|E(gtAv/jvM^ )| < E(|E(e*-^'^^*^» 



(9.24) 







Then we use a standard symmetrization technique. Given the particle model up to time $p \le (n-1)$, we let $\overline{\eta}_n^N$ be an auxiliary independent copy of $\eta_n^N$. In other words, $\overline{\eta}_n^N$ is the empirical measure associated with an independent copy $\overline{\xi}_n$ of the configuration $\xi_n$ of the system at time $n$. With some obvious abusive notation, we readily check that

where $\Delta M_n^N = [\eta_n^N(f_n) - \Phi_n(\eta_{n-1}^N)(f_n)]$. We deduce from this that

|E(giA>/57AAfi' I = JjE(e<;^I/"(€i)-/n«J.)] | 

i=i 

Since the random variables $[f_n(\xi_n^i) - f_n(\overline{\xi}_n^i)]$ and $-[f_n(\xi_n^i) - f_n(\overline{\xi}_n^i)]$ have the same law, their characteristic functions are real and we have

I = E (cos (;^[/n(^^) - /n(f„)]) | 

Using the elementary inequalities

$$\cos u \le 1 - u^2/2 + |u|^3/3!, \qquad 1 + u \le e^u, \qquad\text{and}\qquad |u - v|^3 \le 4(|u|^3 + |v|^3)$$

we prove that






Multiplying over j, we obtain 

Ij.^giAvWAM^' I 

and by (9.24) we conclude that condition (H2) is met with $a_2(n) = c/2$. We now come to the proof of (H1). By the definition of the particle model associated with a given collection of transitions $K_{n,\eta}$ we have

E( giA^/iVAMj:'(/)+4ACi'(/) I 



= n,".i 

with the random function $\widetilde{f}_n^{\,i} = f_n - K_{n,\eta_{n-1}^N}(f_n)(\xi_{n-1}^i)$. Using the elementary inequality

$$|e^z - (1 + z + z^2/2)| \le e^{|z|}\,|z|^3/3!$$







after some computations we see that for any A < y/N we have that 
^ j ^ ^ ^[ACl^if) - {fif] + <,(/) 

with |r^i(/)| < c X^/{Ny/N). This clearly implies that for any A < VN 

= 1 + (/^)“K-.)I + <i(/) 



with |r^2(/)l - ^ X^/{N'/N). It is now convenient to note that for any 
A< V^’ 



A* 



^|AC»(/) - +*(/) 



< c \/>/N 



On the other hand, for any $|z| \le 1/2$ and with the principal value of the logarithm, we recall that

$$\log(1+z) = z - \int_0^z \frac{u}{1+u}\,du = z - z^2 \int_0^1 \frac{t}{1+tz}\,dt$$

Since for any $|z| \le 1/2$ and $t \in [0,1]$ we have $|1+tz| \ge 1/2$, we find that for any $|z| \le 1/2$ we have $|\log(1+z) - z| \le |z|^2$. From previous computation, we conclude that there exists some universal constant $c_0 \in (0,1)$ such that for any $\lambda \le c_0\sqrt{N}$ we have

= ^[ACl^if) - + < 3 (/) 



with |r^3(/)| < c X^/{N^N). Summing over j, we see that for any A < 
CO y/N ’ 






i=i 



< c X^/y/N 



Finally, using the elementary inequality $|e^z - 1| \le |z|\,e^{|z|}$, we conclude that for any $\lambda \le c_0\sqrt{N}$

|E(giAV5VAMi'(/)+4ACi'(/) I _ 1| < c 

VN 







This readily implies that there exists some universal positive constant cj 
such that for any A® < c\ y/N we have 



|E(giAv/57AA/„^(/)+^ACi'(/) I _ i| < c 

y/N 

This proves that condition (H1) is met with $a_1(n) = c$ and $c_1(n) = 1$. ■



9.6 A Donsker Type Theorem 

The random fields $W_n^{\eta,N} = \sqrt{N}\,(\eta_n^N - \eta_n)$ introduced in Section 9.4 can alternatively be regarded as an empirical process indexed by the collection of bounded measurable functions. In this interpretation, the fluctuation results presented earlier simply say that the marginals of the $\mathcal{B}_b(E_n)$-indexed empirical process weakly converge to the marginals of a centered Gaussian process $W_n^\eta$. To simplify the notation, we suppress the superscript $(\cdot)^\eta$ and we write $W_n^N$ and $W_n$ instead of $W_n^{\eta,N}$ and $W_n^\eta$.

In this empirical process interpretation, one natural question we may ask is whether there exists a functional convergence result for an $\mathcal{F}_n$-indexed empirical process $W_n^N(f)$, $f \in \mathcal{F}_n$, where $\mathcal{F}_n \subset \mathcal{B}_b(E_n)$. We recall that weak convergence in $l^\infty(\mathcal{F}_n)$ can be characterized as the convergence of the marginals together with the asymptotic tightness of the process $f \in \mathcal{F}_n \to W_n^N(f) \in \mathbb{R}$. The asymptotic tightness is related to the entropy condition $I(\mathcal{F}_n) < \infty$.

Lemma 9.6.1 If $\mathcal{F}_n$ is a countable collection of functions $f$ such that $\|f\| \le 1$ and $I(\mathcal{F}_n) < \infty$, then the $\mathcal{F}_n$-indexed process $W_n^N(f)$, $f \in \mathcal{F}_n$, is asymptotically tight.

From previous comments, one concludes the following.

Theorem 9.6.1 (Donsker) When condition (G) holds true and $I(\mathcal{F}_n) < \infty$, the empirical process

$$W_n^N : f \in \mathcal{F}_n \to W_n^N(f) = \sqrt{N}\,(\eta_n^N(f) - \eta_n(f))$$

converges in law in $l^\infty(\mathcal{F}_n)$ to a centered Gaussian process $W_n(f)$, $f \in \mathcal{F}_n$.

Proof of Lemma 9.6.1: Using the Cauchy-Schwarz inequality, we have

$$\mathbb{E}(|W_n(f) - W_n(h)|^2) \le c(n)\,\|f - h\|^2_{L_2(\eta_n)}$$

for some constant $c(n) < \infty$ and all $f, h \in L_2(\eta_n)$. In particular, to prove asymptotic tightness it is enough to establish the asymptotic equicontinuity






in probability of $(W_n^N(f))_{f \in \mathcal{F}_n}$ with respect to the semi-norm on $\mathcal{F}_n$ given by $\|f\|_{L_2(\eta_n)}$ (see for instance Chap. 1.5 and Example 1.5.10 on p. 40 in [311]). Since $I(\mathcal{F}_n) < \infty$, the class $\mathcal{F}_n$ is totally bounded in $L_2(\eta_n)$ for any $n \ge 0$. According to the preceding comment, it will be enough to show that

(9.25) 

for all sequences $\delta_N \downarrow 0$ where, for any $\delta > 0$ (possibly infinite),

$$\mathcal{F}_n(\delta) = \{f - h;\ f, h \in \mathcal{F}_n : \|f - h\|_{L_2(\eta_n)} < \delta\}$$

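For this task, we use the traditional decomposition

$$\eta_n^N(f) - \eta_n(f) = \sum_{p=0}^{n} [\Phi_{p,n}(\eta_p^N)(f) - \Phi_{p,n}(\Phi_p(\eta_{p-1}^N))(f)]$$

with

$$\Phi_{p,n}(\mu)(f) - \Phi_{p,n}(\eta)(f) = \frac{1}{\eta(G_{p,n})} \Big[ \big(\mu(G_{p,n}P_{p,n}(f)) - \eta(G_{p,n}P_{p,n}(f))\big) + \Phi_{p,n}(\mu)(f)\,\big(\eta(G_{p,n}) - \mu(G_{p,n})\big) \Big]$$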

to prove that 



llrf - [ 11’’" “ 



where 



-^P.n(<J) = {Gp,nPp,n(/);/e:F(")(<5)}, 



Gp.n = Gp,n/||Gp,„||, and Pp,„ = f[^^p S HC')- Since for any 0 < p < n 

E{[v^{Gp,n) - $p«-i)(Gp,„)]") < ^ 
to prove (9.25) it suffices to check that for any 0 < p < n and i 0 







where / 6 Let us prove 1). Let £ = (£i)i>i con- 

stitute a sequence of independent and identically distributed with P(si = 
-1-1) = P(ei = -1) = 1/2. We also assume that e and the particle model 
$(\xi_n)_{n \ge 0}$ are independent. By the symmetrization inequalities, for any $N$,

where m^($p) = Fix Cp = % the Chernov- 

Hoeffding inequality (see Lemma 7.3.2), the process / -+ VNm^{^p){f) is 
sub-Gaussian with respect to the norm || • ||i,j(,,iv). Namely, for any f,he 
Pp,n{SN) and 7 > 0, 

P (lVN(mf (^p)(/) - (^p)(h))| > 7 Up) < 

Using the maximal inequality for sub-Gaussian processes (see for instance [311, 217]), we get the quenched inequality

E (||m^ (^p)IUp,„(«Af) I ^p) 

<■^1 Vlog (1 N{S, Pp.„(5a,), L2«))) ds 

” (9.26) 

where 6p,n{N) = ||»?^||^ On the other hand, we clearly have that, 

for every (J > 0, 

N(<J,Pp.„(<J),L2«)) < N(rf,Pp.„(oo),L2«)) < N^S/2,Pp,n,HVp)) 

where we recall that Pp,„ = Gp,„ • Pp,„ P„. Under our assumptions, it thus 
follows from Lemma 7.3.4 in Section 7.3 that 



N(5,Pp.„(<5),l2«)) <^(^/2,P„) 
Using (9.26), one concludes that, for every N > 1, 








By the dominated convergence theorem, to prove 1) it suffices to check that
Jim = P-a... (9.27) 

We establish this property by proving that 

a) 

Let /,h € be chosen so that t?„((/ - h)^) < (i.e., f - h e 

Use the Cauchy-Schwarz inequality to see that

<7,p(gJ,„Pp,„((/-/i)2)) (9.28) 

Since 0 < Gp,„ < 1, the right-hand side of (9.28) is bounded above by 

which is less than $\delta_N^2$. This ends the proof of a). To prove b), first note that
hp ~ '^p\\:fi „(8n) - 

Now, to prove b), it certainly suffices to show that 

supAT(<J,p2^(oo),L2(A‘)) < 00 

M 

for every 5 > 0. Since all functions in have norm less than or equal to 
1, we have \\f - < 4||/ - for any f,h in Pp,„(oo) and 

any fx G V{Ep). It follows that, for every <5 > 0, 

iV(<J,P^,„(oo),L2(M)) < Ar(<J/4,Pp.„(oo),L2(M)) 

Since fV((J,Pp,„(oo),L 2 (/i)) < iV^^((5/2,Pp,„,L2(/i)), one concludes, using 
Lemma 7.3.4, that 

s^lpN{S,:FlJoo),L2{^i)) < sup7V2(J/8,:F„,L2(/i)) 



This ends the proof of b) and 1). In the same way, by dominated conver- 
gence, the proof of 2) is an immediate consequence of (9.27). This completes 
the proof of the lemma. ■ 







9.7 Path-Space Models 

This section discusses the fluctuations of the path-particle McKean measures $\mathbb{K}_n^N$. We restrict our study to the simple mutation/selection genetic model associated with the first McKean interpretation model $K_{n,\eta}(x,\cdot) = \Phi_n(\eta)$. In this situation the limiting McKean measures reduce to the tensor product measures $\mathbb{K}_n = \eta_0 \otimes \cdots \otimes \eta_n$. To simplify the presentation, we further assume that the state spaces are homogeneous, $E_n = E$, and the Markov transitions $M_n$ satisfy the regularity condition $(M)^{\exp}$ introduced on page 116, for some kernels $k_n$ and some reference measures $p_n$. Under these conditions we recall (see page 264) that the law of the $N$-path particles

$$\mathbb{P}_n^{(N)} = \mathrm{Law}\big((\xi_0^i, \ldots, \xi_n^i)_{1 \le i \le N}\big) \in \mathcal{P}\big((E^{n+1})^N\big)$$

is absolutely continuous with respect to the tensor product measure $\mathbb{K}_n^{\otimes N}$ and

$$\frac{d\mathbb{P}_n^{(N)}}{d\mathbb{K}_n^{\otimes N}} = \exp H_n^{(N)} \qquad \mathbb{K}_n^{\otimes N}\text{-a.e.}$$



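The interaction potential function is defined by

$$H_n^{(N)}(x) = N \sum_{p=1}^{n} \int \log \frac{d\Phi_p(m(x_{p-1}))}{d\eta_p}\; dm(x_p)$$

with $m(x_p) = \frac{1}{N} \sum_{i=1}^{N} \delta_{x_p^i}$.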

To clarify the presentation, we shall simplify the notation suppressing 
the time parameter n so that we write instead of 

We also write U = C = ^ = 

{^')i<i<N, and Etv(.) (resp. En{-)) denotes the expectation with respect 
to the measure K®^ (resp. p(^i) on 
To get the fluctuations of the particle McKean measures, it is enough to study the limit of their characteristic functions

<p € L 2 (K) Ejv(exp (iW^(v?))) where = \/N [K^ - K] 



Writing 



E;v(exp (iW^{<p))) = E^,(exp (tW^(y)) -h 

one finds that the convergence of these characteristic functions follows from the convergence in law and the uniform integrability of the sequence $\exp(iW^N(\varphi) + H^{(N)}(\xi))$ under the product law $\mathbb{K}^{\otimes N}$. The last point is clearly equivalent to the uniform integrability of $\exp H^{(N)}(\xi)$ under $\mathbb{K}^{\otimes N}$. The proof of the uniform






integrability of $\exp H^{(N)}(\xi)$ then relies on a classical result that says that if a sequence of nonnegative random variables $X_N$ converges almost surely towards some random variable $X$ as $N \to \infty$, then we have

$$\lim_{N \to \infty} \mathbb{E}(X_N) = \mathbb{E}(X) < \infty \iff \{X_N;\ N \ge 1\} \text{ is uniformly integrable}$$

The equivalence still holds if $X_N$ only converges in distribution by Skorohod's theorem. Since $\exp H^{(N)}(\xi)$ has unit expectation under $\mathbb{K}^{\otimes N}$, it is clear that the uniform integrability of $\exp H^{(N)}(\xi)$ follows from the convergence in distribution under $\mathbb{K}^{\otimes N}$ of $H^{(N)}(\xi)$ towards a random variable $H$ such that $\mathbb{E}(\exp H) = 1$. Thus, it suffices to study the convergence in distribution of $iW^N(\varphi) + H^{(N)}(\xi)$ for $L_2(\mathbb{K})$ functions $\varphi$ to conclude. To state such a result, it is first convenient to introduce another round of notation. Under our assumptions, for any $x = (x_0, \ldots, x_n)$ and $z = (z_0, \ldots, z_n) \in E^{n+1}$ we set

$$q(x,z) = \sum_{p=1}^{n} q_p(x,z) \quad\text{and}\quad a(x,z) = q(x,z) - \int q(x',z)\,\mathbb{K}(dx')$$

with

$$q_p(x,z) = \frac{G_{p-1}(z_{p-1})\,k_p(z_{p-1},x_p)}{\eta_{p-1}(G_{p-1}k_p(\cdot,x_p))} \qquad (9.29)$$

One consequence of $(M)^{\exp}$ is that the integral operator $A$ given for any $\varphi \in L_2(\mathbb{K})$ by

$$A(\varphi)(x) = \int a(z,x)\,\varphi(z)\,\mathbb{K}(dz)$$

is a Hilbert-Schmidt operator on $L_2(\mathbb{K})$.

We are now in a position to state the main result of this section. 

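Theorem 9.7.1 Assume that condition $(M)^{\exp}$ is satisfied. The integral operator $I - A$ is invertible and the random field $W^N : \varphi \in L_2(\mathbb{K}) \to W^N(\varphi)$ converges as $N \to \infty$ to a centered Gaussian field $W : \varphi \in L_2(\mathbb{K}) \to W(\varphi)$ satisfying

$$\mathbb{E}(W(\varphi_1)W(\varphi_2)) = \big\langle (I - A)^{-1}(\varphi_1 - \mathbb{K}(\varphi_1)),\ (I - A)^{-1}(\varphi_2 - \mathbb{K}(\varphi_2)) \big\rangle_{L_2(\mathbb{K})}$$

for any $\varphi_1, \varphi_2 \in L_2(\mathbb{K})$ in the sense of convergence of finite-dimensional distributions.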

The basic tools for studying the convergence in law of $H^{(N)}(\xi)$ are the Dynkin-Mandelbaum theorem on symmetric statistics and Shiga and Tanaka's formula of Lemma 1.3 in [286]. The detailed proof of Theorem 9.7.1 is given in [84]. Here we merely content ourselves with describing






the main line of this approach. Let us first recall how one can see that $I - A$ is invertible. This is in fact classical now (see [286] for instance). First one notices that, under our assumptions, $A^n$, $n \ge 2$, and $AA^*$ are trace class operators with

$$\mathrm{tr}\,A^n = \int \cdots \int a(x^1,x^2) \cdots a(x^n,x^1)\,\mathbb{K}(dx^1) \cdots \mathbb{K}(dx^n)$$

$$\mathrm{tr}\,AA^* = \int a(x,z)^2\,\mathbb{K}(dx)\,\mathbb{K}(dz) = \|a\|^2_{L_2(\mathbb{K} \otimes \mathbb{K})}$$

Furthermore, by the definition of $a$ and the fact that $\mathbb{K}$ is a product measure, it is easily checked that $\mathrm{tr}\,A^n = 0$ for any $n \ge 2$. Standard spectral theory then shows that $\det_2(I - A)$ is equal to one and therefore that $I - A$ is invertible.

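The identification of the $\mathbb{K}^{\otimes N}$-weak limit of $H^{(N)}(\xi)$ relies on $L_2$-techniques and more precisely on the Dynkin-Mandelbaum construction of multiple Wiener integrals as a limit of symmetric statistics. To state such a result, we first introduce Wiener integrals. Let $\{I_1(\varphi);\ \varphi \in L_2(\mathbb{K})\}$ be a centered Gaussian field satisfying $\mathbb{E}(I_1(\varphi_1)I_1(\varphi_2)) = \langle \varphi_1, \varphi_2 \rangle_{L_2(\mathbb{K})}$. If we set, for each $\varphi \in L_2(\mathbb{K})$ and $m \ge 1$,

$$h_m^\varphi(z_1, \ldots, z_m) = \varphi(z_1) \cdots \varphi(z_m)$$

the multiple Wiener integrals $\{I_m(h_m^\varphi);\ \varphi \in L_2(\mathbb{K})\}$ with $m \ge 1$ are defined by the relation

$$\sum_{m \ge 0} \frac{t^m}{m!}\,I_m(h_m^\varphi) = \exp\Big(t\,I_1(\varphi) - \frac{t^2}{2}\,\|\varphi\|^2_{L_2(\mathbb{K})}\Big)$$

The multiple Wiener integral $I_m(\phi)$ for $\phi \in L_{2,\mathrm{sym}}(\mathbb{K}^{\otimes m})$ is then defined by a completion argument. Theorem 9.7.1 is therefore a consequence of the following lemma.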

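Lemma 9.7.1 Let $\xi = (\xi^i)_{i \ge 1}$ be a collection of independent and identically distributed random variables with common law $\mathbb{K}$. We have

$$\lim_{N \to \infty} H^{(N)}(\xi) \stackrel{\text{law}}{=} \frac{1}{2}\,I_2(f) - \frac{1}{2}\,\mathrm{tr}\,AA^* \qquad (9.30)$$

where $f$ is given by

$$f(y,z) = a(y,z) + a(z,y) - \int a(u,y)\,a(u,z)\,\mathbb{K}(du) \qquad (9.31)$$

In addition, for any $\varphi \in L_2(\mathbb{K})$,

$$\lim_{N \to \infty} \big(H^{(N)}(\xi) + iW^N(\varphi)\big) \stackrel{\text{law}}{=} \frac{1}{2}\,I_2(f) + iI_1(\varphi) - \frac{1}{2}\,\mathrm{tr}\,AA^*$$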
V-¥00 L I 







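Following the observations above, we get for any $\varphi \in L_2(\mathbb{K})$,

$$\lim_{N \to \infty} \mathbb{E}_{\mathbb{P}^{(N)}}\big(\exp iW^N(\varphi)\big) = \lim_{N \to \infty} \mathbb{E}_{\mathbb{K}^{\otimes N}}\big(\exp(iW^N(\varphi) + H^{(N)}(\xi))\big) = \mathbb{E}\big(\exp(iI_1(\varphi) + \tfrac{1}{2}\,I_2(f) - \tfrac{1}{2}\,\mathrm{tr}\,AA^*)\big)$$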

Moreover, Shiga and Tanaka’s formula of Lemma 1.3 in [286] shows that 
for any ¥> € L 2 ,,y„(K), 

E ^exp (^ih{ip) -I- ^hif) - ^tr^i4*^^ = exp - >1)"V||L(K)) 

(9.32) 

The proof of Theorem 9.7.1 is thus complete. The proof of Lemma 9.7.1 relies entirely on a construction of multiple Wiener integrals as a limit of symmetric statistics. For completeness and to guide the reader, we present this result.

Let $\{\zeta^i;\ i \ge 1\}$ be a sequence of independent and identically distributed random variables with values in an arbitrary measurable space $(\mathcal{X}, \mathcal{B})$. To every symmetric function $h(z_1, \ldots, z_m)$, there corresponds a statistic

$$\sigma_m^N(h) = \sum_{1 \le i_1 < \cdots < i_m \le N} h(\zeta^{i_1}, \ldots, \zeta^{i_m})$$

with the convention $\sigma_m^N = 0$ for $m > N$. Every integrable symmetric statistic $S(\zeta^1, \ldots, \zeta^N)$ has a unique representation of the form

$$S(\zeta^1, \ldots, \zeta^N) = \sum_{m \ge 0} \sigma_m^N(h_m) \qquad (9.33)$$

where $h_m(z_1, \ldots, z_m)$ are symmetric functions subject to the condition

$$\int h_m(z_1, \ldots, z_{m-1}, u)\,\mu(du) = 0 \qquad (9.34)$$

where $\mu$ is the probability distribution of $\zeta^1$. We call such functions $h_m$, $m \ge 0$, canonical. Finally, we denote by $\mathcal{H}$ the set of all sequences

$$h = (h_0, h_1(z_1), \ldots, h_m(z_1, \ldots, z_m), \ldots)$$

where $h_m$ are canonical and $\sum_{m \ge 0} \frac{1}{m!}\,\mathbb{E}(h_m^2(\zeta^1, \ldots, \zeta^m)) < \infty$. As in [286], the proof of Lemma 9.7.1 is essentially based on the following theorem.

Theorem 9.7.2 (Dynkin-Mandelbaum [129]) For $h \in \mathcal{H}$, the sequence of random variables $Z_N(h) = \sum_{m \ge 0} N^{-m/2}\,\sigma_m^N(h_m)$ converges in law, as $N \to \infty$, to $W(h) = \sum_{m \ge 0} I_m(h_m)/m!$
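As a small illustration of how the symmetric statistics enter this theorem, the following sketch evaluates $Z_N(h)$ for the sequence $h = (0, \varphi, \varphi \otimes \varphi, 0, \ldots)$ and compares it with the closed form $N^{-1/2}\sum_i \varphi(\zeta^i) + \frac{1}{2N}\big[(\sum_i \varphi(\zeta^i))^2 - \sum_i \varphi(\zeta^i)^2\big]$, whose limit is $I_1(\varphi) + \frac{1}{2}(I_1(\varphi)^2 - \|\varphi\|^2)$. The choice of standard Gaussian samples with $\varphi(x) = x$, as well as all function names, are assumptions made only for this example.

```python
from itertools import combinations
import numpy as np

def sigma_stat(h, sample, m):
    """sigma_m^N(h): sum of h over all index sets i_1 < ... < i_m (zero when m > N)."""
    n = len(sample)
    if m > n:
        return 0.0
    return float(sum(h(*(sample[i] for i in idx))
                     for idx in combinations(range(n), m)))

def z_statistic(sample, phi):
    """Z_N(h) for h = (0, phi, phi x phi, 0, ...): only orders one and two contribute."""
    n = len(sample)
    s1 = sigma_stat(lambda x: phi(x), sample, 1)
    s2 = sigma_stat(lambda x, y: phi(x) * phi(y), sample, 2)
    return s1 / np.sqrt(n) + s2 / n

def closed_form(sample, phi):
    """Same quantity written with the sums of phi and phi^2 only."""
    values = np.array([phi(x) for x in sample])
    n = len(values)
    s, q = values.sum(), (values ** 2).sum()
    return float(s / np.sqrt(n) + (s * s - q) / (2 * n))

rng = np.random.default_rng(0)
sample = rng.standard_normal(200)   # phi(x) = x is centered under N(0, 1), hence canonical
phi = lambda x: x
```

Here `z_statistic(sample, phi)` and `closed_form(sample, phi)` agree up to rounding errors, which is the algebraic identity behind the Hermite-type limit of the second-order term.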







Proof of Lemma 9.7.1: (Sketched) 

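It is first useful to observe that for any $\mu \in \mathcal{P}(E)$ and $p \ge 1$ we have that

$$\Phi_p(\mu) \ll \eta_p \quad\text{and}\quad \frac{d\Phi_p(\mu)}{d\eta_p}(x) = \frac{\mu(G_{p-1}(\cdot)\,k_p(\cdot,x))/\mu(G_{p-1})}{\eta_{p-1}(G_{p-1}(\cdot)\,k_p(\cdot,x))/\eta_{p-1}(G_{p-1})}$$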

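By the definition of $\mathbb{K}$ and $q_p$ (see (9.29)), we note that

$$\int q_p(x,z)\,\mathbb{K}(dx) = \int_E \frac{G_{p-1}(z_{p-1})\,k_p(z_{p-1},x_p)}{\eta_{p-1}(G_{p-1}k_p(\cdot,x_p))}\,\eta_p(dx_p) = \int_E \frac{G_{p-1}(z_{p-1})\,k_p(z_{p-1},x_p)}{\eta_{p-1}(G_{p-1}k_p(\cdot,x_p))}\,\frac{\eta_{p-1}(G_{p-1}k_p(\cdot,x_p))}{\eta_{p-1}(G_{p-1})}\,p_p(dx_p) = \frac{G_{p-1}(z_{p-1})}{\eta_{p-1}(G_{p-1})}$$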

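Therefore the symmetric statistics $x \in (E^{n+1})^N \to H^{(N)}(x)$ can be rewritten as

$$H^{(N)}(x) = \sum_{p=1}^{n} \sum_{i=1}^{N} \Big[ \log\Big(\frac{1}{N} \sum_{j=1}^{N} q_p(x^i,x^j)\Big) - \log\Big(\frac{1}{N} \sum_{j=1}^{N} g_p(x^j)\Big) \Big]$$

where

$$g_p(x^j) = G_{p-1}(x_{p-1}^j)/\eta_{p-1}(G_{p-1}) = \int q_p(y,x^j)\,\mathbb{K}(dy)$$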

By the representation

$$\log u = (u-1) - \frac{(u-1)^2}{2} + \frac{(u-1)^3}{3(\varepsilon u + (1-\varepsilon))^3}$$

which is valid for all $u > 0$ with $\varepsilon = \varepsilon(u)$ such that $\varepsilon(u) \in [0,1]$, we obtain
the decomposition 

= sEE»(^‘.>^)- 5 Ee(sE'''><’^‘'^)-'') 

t=lj=l ^p=li=l\^i=l / 

+f E(^EM*")-lj +fiW (9.35) 

where the remainder term cancels as $N$ tends to $\infty$. The technical trick is to decompose each term as in (9.33) in order to identify the limit by






applying Theorem 9.7.2. For instance, the first term can readily be written 
as 

i=l i<j 

with 

$$\int a(z,z)\,\mathbb{K}(dz) = \int a(x,z)\,\mathbb{K}(dz) = \int a(z,x)\,\mathbb{K}(dz) = 0$$

for any $x \in E^{n+1}$ and therefore a clear application of Theorem 9.7.2 yields that it converges in law as $N \to \infty$ to $\frac{1}{2}\,I_2(a + a^*)$. ■



9.8 Covariance Functions 

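We use the same notation and the same regularity conditions as the ones used in Section 9.7. We have proved that the random field

$$\varphi \in L_2(\mathbb{K}) \to W^N(\varphi) = \sqrt{N}\,\Big(\frac{1}{N} \sum_{i=1}^{N} \varphi(\xi_0^i, \ldots, \xi_n^i) - \mathbb{K}(\varphi)\Big)$$

converges as $N \to \infty$, in the sense of convergence of finite-dimensional distributions, to a centered Gaussian field $\{W(\varphi);\ \varphi \in L_2(\mathbb{K})\}$ with covariance

$$\mathbb{E}(W(\varphi_1)W(\varphi_2)) = \big\langle (I - A)^{-1}(\varphi_1 - \mathbb{K}(\varphi_1)),\ (I - A)^{-1}(\varphi_2 - \mathbb{K}(\varphi_2)) \big\rangle_{L_2(\mathbb{K})}$$

for $\varphi_1, \varphi_2 \in L_2(\mathbb{K})$. From the observations above, it follows that the process

$$f \in L_2(\eta_n) \to W_n^{\eta,N}(f) = \sqrt{N}\,(\eta_n^N(f) - \eta_n(f))$$

converges in the sense of convergence of finite-dimensional distributions and as $N \to \infty$ to a centered Gaussian field $\{W_n^\eta(f);\ f \in L_2(\eta_n)\}$ satisfying

$$\mathbb{E}(W_n^\eta(f)W_n^\eta(h)) = \big\langle (I - A)^{-1}(f - \eta_n(f))^{\otimes n},\ (I - A)^{-1}(h - \eta_n(h))^{\otimes n} \big\rangle_{L_2(\mathbb{K})}$$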






for any $f, h \in L_2(\eta_n)$, where $f^{\otimes n} = \underbrace{1 \otimes \cdots \otimes 1}_{n \text{ times}} \otimes f$, for all $f \in L_2(\eta_n)$.

From (9.14) we also know that the covariance function is given by

$$\mathbb{E}(W_n^\eta(f)W_n^\eta(h)) = \sum_{p=0}^{n} \int \Big(\frac{G_{p,n}}{\eta_p(G_{p,n})}\Big)^2 (P_{p,n}(f) - \eta_n(f))\,(P_{p,n}(h) - \eta_n(h))\,d\eta_p$$

In the next proposition, we invert the integral operator and check that these two expressions do coincide. This reassuring result gives some precise information on the decompositions to use to obtain fluctuations of particle density profiles. As an aside, we mention that it was in fact at the origin of our study of CLTs for particle density profiles.

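Proposition 9.8.1 For every $f \in L_2(\eta_n)$, and every $z_0, \ldots, z_n \in E$,

$$(I - A)^{-1}(f - \eta_n(f))^{\otimes n}(z_0, \ldots, z_n) = \sum_{p=0}^{n} f_{p,n}(z_p)$$

where the functions $f_{p,n}$, $0 \le p \le n$, are given for any $p \le n$ by

$$f_{p,n} = \frac{G_{p,n}}{\eta_p(G_{p,n})}\,(P_{p,n}(f) - \eta_n(f))$$

with the convention $G_{n,n} = 1$ and $P_{n,n} = Id$.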

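Proof: We first note that

$$f_{p-1,n} = \frac{G_{p-1}\,M_p(G_{p,n})}{\eta_{p-1}(G_{p-1,n})} \Big(\frac{M_p(G_{p,n}P_{p,n}(f))}{M_p(G_{p,n})} - \eta_n(f)\Big) = \frac{G_{p-1}}{\eta_{p-1}(G_{p-1,n})}\,\big(M_p(G_{p,n}P_{p,n}(f)) - M_p(G_{p,n})\,\eta_n(f)\big)$$

and therefore

$$f_{p-1,n} = G_{p-1}\,\frac{\eta_p(G_{p,n})}{\eta_{p-1}(G_{p-1,n})}\,M_p\Big(\frac{G_{p,n}}{\eta_p(G_{p,n})}\,(P_{p,n}(f) - \eta_n(f))\Big) = G_{p-1}\,\frac{\eta_p(G_{p,n})}{\eta_{p-1}(G_{p-1,n})}\,M_p(f_{p,n})$$

Then, using the fact that

$$\eta_p(G_{p,n}) = \frac{\eta_{p-1}(G_{p-1}M_p(G_{p,n}))}{\eta_{p-1}(G_{p-1})} = \frac{\eta_{p-1}(G_{p-1,n})}{\eta_{p-1}(G_{p-1})}$$

one easily gets the backward recursion equations, for each $1 \le p \le n$,

$$f_{p-1,n} = \frac{G_{p-1}}{\eta_{p-1}(G_{p-1})}\,M_p(f_{p,n}) \qquad (9.36)$$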
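The backward recursion can be checked numerically on a finite state space, where kernels are matrices and functions are vectors. The sketch below assumes a time-homogeneous potential `G` and transition matrix `M` (a simplification made only for this example) and uses $G_{p,n} = Q_{p,n}(1)$ and $G_{p,n}P_{p,n}(f) = Q_{p,n}(f)$ with $Q_{p-1,n} = G\,M\,Q_{p,n}$; all names are ours.

```python
import numpy as np

def flow(eta0, G, M, n):
    """Exact flow eta_p = Phi_p(eta_{p-1}) with Phi(eta) = (eta G) M / eta(G)."""
    etas = [np.asarray(eta0, dtype=float)]
    for _ in range(n):
        w = etas[-1] * G
        etas.append(w @ M / w.sum())
    return etas

def f_pn(etas, G, M, f, n):
    """[f_{0,n}, ..., f_{n,n}] with f_{p,n} = G_{p,n} (P_{p,n} f - eta_n f) / eta_p(G_{p,n})."""
    q_f = f - etas[n] @ f          # Q_{n,n}(f - eta_n f)
    q_1 = np.ones_like(f)          # G_{n,n} = 1
    out = [None] * (n + 1)
    for p in range(n, -1, -1):
        out[p] = q_f / (etas[p] @ q_1)
        q_f = G * (M @ q_f)        # Q_{p-1,n} = G_{p-1} M_p Q_{p,n}
        q_1 = G * (M @ q_1)
    return out

def recursion_gap(seed=0, d=5, n=4):
    """Largest violation of (9.36) and of the centering eta_p(f_{p,n}) = 0."""
    rng = np.random.default_rng(seed)
    M = rng.random((d, d))
    M /= M.sum(axis=1, keepdims=True)
    G = rng.uniform(0.5, 1.5, d)
    f = rng.standard_normal(d)
    etas = flow(np.full(d, 1.0 / d), G, M, n)
    fs = f_pn(etas, G, M, f, n)
    gap = max(np.abs(fs[p - 1] - G / (etas[p - 1] @ G) * (M @ fs[p])).max()
              for p in range(1, n + 1))
    centering = max(abs(etas[p] @ fs[p]) for p in range(n + 1))
    return gap, centering
```

Both numbers returned by `recursion_gap()` are of the order of the machine precision; the second one reflects the fact that each $f_{p,n}$ is centered under $\eta_p$.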







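By the definition of $A$, for any $\varphi \in L_2(\mathbb{K})$ we have that

$$A\varphi(z_0, \ldots, z_n) = \sum_{m=1}^{n} \int \varphi(x_0, \ldots, x_n)\,a_m(x_m, z_{m-1})\,\mathbb{K}(dx_0, \ldots, dx_n)$$

where, for every $1 \le m \le n$,

$$a_m(x_m, z_{m-1}) = \frac{G_{m-1}(z_{m-1})\,k_m(z_{m-1},x_m)}{\eta_{m-1}(G_{m-1}k_m(\cdot,x_m))} - \frac{G_{m-1}(z_{m-1})}{\eta_{m-1}(G_{m-1})}$$

On the other hand, we observe that since

$$\eta_m(dx_m) = \frac{\eta_{m-1}(G_{m-1}k_m(\cdot,x_m))}{\eta_{m-1}(G_{m-1})}\,p_m(dx_m)$$

and $M_m(z_{m-1}, dx_m) = k_m(z_{m-1},x_m)\,p_m(dx_m)$, we have that

$$\frac{G_{m-1}(z_{m-1})\,k_m(z_{m-1},x_m)}{\eta_{m-1}(G_{m-1}k_m(\cdot,x_m))}\,\eta_m(dx_m) = \frac{G_{m-1}(z_{m-1})}{\eta_{m-1}(G_{m-1})}\,M_m(z_{m-1}, dx_m)$$

Therefore,

$$(I - A)\varphi(z_0, \ldots, z_n) = \varphi(z_0, \ldots, z_n) + \sum_{m=1}^{n} \frac{G_{m-1}(z_{m-1})}{\eta_{m-1}(G_{m-1})} \int \varphi(x_0, \ldots, x_n)\,\eta^{(m)}_{[0,n]}(z_{m-1}; dx_0, \ldots, dx_n) \qquad (9.37)$$

where

$$\eta^{(m)}_{[0,n]}(z_{m-1}; dx_0, \ldots, dx_n) = (\eta_0 \otimes \cdots \otimes \eta_{m-1})(dx_0, \ldots, dx_{m-1}) \times (\eta_m - M_m(z_{m-1}, \cdot))(dx_m) \times (\eta_{m+1} \otimes \cdots \otimes \eta_n)(dx_{m+1}, \ldots, dx_n)$$

Now choose $\varphi$ given for any $z_0, \ldots, z_n \in E$ by

$$\varphi(z_0, \ldots, z_n) = \sum_{p=0}^{n} f_{p,n}(z_p)$$

Then, we get

$$(I - A)\varphi(z_0, \ldots, z_n) = \varphi(z_0, \ldots, z_n) - \sum_{m=1}^{n} \frac{G_{m-1}(z_{m-1})}{\eta_{m-1}(G_{m-1})}\,M_m(f_{m,n})(z_{m-1})$$

Finally, using the backward equation (9.36), one concludes that

$$(I - A)\varphi(z_0, \ldots, z_n) = \varphi(z_0, \ldots, z_n) - \sum_{m=1}^{n} f_{m-1,n}(z_{m-1})$$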







so that the result follows from

$$(I - A)\varphi(z_0, \ldots, z_n) = f_{n,n}(z_n) = f(z_n) - \eta_n(f)$$



This ends the proof of the proposition. 




10 

Large-Deviation Principles 



This chapter focuses on large-deviation principles for interacting particle models. The main object of the theory of large deviations is to provide sharp exponential estimates of the deviant behavior of random rare events. For instance, in the context of interacting particle models, these events represent the deviations of particle approximation measures around the "nondeviant" limiting McKean distribution or around the solution of the limiting measure-valued equation. We have already analyzed some rather crude exponential decays of these deviation probabilities in Section 7.4. Although these estimates were not asymptotic, they were far from being sharp. These exponential decays are sometimes called strong large-deviation estimates by some authors. In this context, a large-deviation analysis will provide sharp and precise estimates. Before entering into more details on this subject, it is useful to give some comments on the origins of the theory of large deviations and its connections with other scientific disciplines.

The theory of large deviations probably started around the 1930s with the works of A.I. Khintchin [201], N. Smirnoff [290], and H. Cramér [62]. These pioneering studies were motivated by the refinement of the estimates provided by the central limit theorem. Since that period, the range of applications of large-deviation analysis has constantly increased, covering particle physics [300], dynamical systems [15, 301], stochastic search algorithms [15, 44], statistics [19, 20, 70, 211], statistical mechanics [61, 133], and pure probability [1, 2, 3, 118, 119, 120, 293, 300]. The foundations of the modern theory of large deviations in abstract topological function spaces were laid in celebrated articles by S.R.S. Varadhan [300] and by A.A. Borovkov [36]. There are actually several interesting







and complementary textbooks devoted to this subject. We refer the reader to [112, 114, 127, 133] for a detailed account on this topic and a complete list of references.

More recently in the late 1980s, V.P. Maslov presented in [240] a theory of integration of functions taking values in an idempotent semiring. The development of this idempotent analysis was motivated by the study of some Hamilton-Jacobi equations arising in particle physics. This idempotent integration theory is built in the same fashion as the traditional Lebesgue integration. In this context, lower semicontinuous functions can be regarded as idempotent measures, and good rate functions governing large-deviation principles correspond to idempotent probability measures. These interpretations shed some new light on the connections between probability theory and stochastic processes on the one hand and deterministic optimization theory and controlled dynamical systems on the other hand. They also allow probabilistic intuition to enter into deterministic optimization and decision processes. For more details on idempotent probability measures, we refer the reader to [77, 78, 79, 205, 272]. From these points of view, the theory of large deviations can be interpreted as an analytical bridge between idempotent functional analysis and deterministic decision dynamical systems and the traditional theory of probability and stochastic processes.

The chapter has the following structure. In Section 10.1, we present our 
main large-deviation results on discrete generation interacting processes, 
namely a large-deviation principle for particle McKean measures on metric 
spaces and the extension of the Sanov theorem in the strong topology 
to interacting processes in general Hausdorff topological spaces. In this 
preliminary section, we also introduce the general definition of what is 
meant by a large-deviation principle. We interpret these principles in the 
context of particle approximation models and connect these sharp estimates 
with the exponential inequalities presented in earlier sections. We shall see 
that these asymptotic estimates are strongly related to the choice of the 
topological structure of the state spaces. We illustrate these topological 
questions with a brief discussion on the weak and the strong topologies in 
distribution spaces. We already mention that the strong $\tau$-topology on the
set of probability measures over some measurable space is in general not 
metrizable and not even first countable. This simple observation indicates 
that the large deviation analysis of these models has to be conducted in 
abstract and general topological spaces. 

In Section 10.2, we have collected some essential topological properties 
of lower semicontinuous functions with an emphasis on their idempotent 
measure interpretation. These elementary results, such as the well-known 
contraction principle, will be of current use in various places of this chapter. 

In Sections 10.3, 10.4, and 10.5, we underline three generic strategies to obtain a large-deviation principle, namely Cramér's method, Laplace-Varadhan integral techniques, and Dawson-Gärtner projective limit techniques. The first and third techniques will be used in Section 10.8 to analyze







the deviations of flows of particle density profiles in the $\tau$-topology, extending a strong version of Sanov's theorem due to Groeneboom, Oosterhoff, and Ruymgaart [170]. The second strategy will be used in Section 10.7 to develop a large-deviation principle for particle McKean measures simplifying and extending a joint work of the author with A. Guionnet [83]. In each of these three sections, we propose a comprehensive and self-contained treatment on these methods (except for the large-deviation lower bound in the generalized Cramér's method, see Theorem 10.3.1). In addition, we complement these traditional techniques with some recent results. Section 10.4 contains a complement of the classical integral lemma of Varadhan recently presented in a joint work of the author with T. Zajic [109, 110]. In Section 10.5, we revisit Dawson-Gärtner techniques. We also propose a new, simple proof of the strong version of Sanov's theorem based on these projective ideas and multinomial expansions in the spirit of Groeneboom, Oosterhoff, and Ruymgaart [170].



10.1 Introduction 

We start with some familiar notation on discrete generation measure-valued processes and interacting particle models. We let $(E_n, \mathcal{E}_n)$, $n \in \mathbb{N}$, be a collection of Polish spaces¹ with Borel $\sigma$-fields $\mathcal{E}_n$. We also consider a probability measure $\eta_0 \in \mathcal{P}(E_0)$ and a collection of measurable mappings $\Phi_{n+1}$ from $\mathcal{P}(E_n)$ into $\mathcal{P}(E_{n+1})$. We associate with the latter sequence of mappings the nonlinear measure-valued equation

$$\eta_n = \Phi_n(\eta_{n-1}) \qquad (10.1)$$

At this stage, it is worthwhile to recall that the (nonlinear) distribution flow $\eta_n \in \mathcal{P}(E_n)$ can be interpreted as the laws of the random states of a nonhomogeneous Markov chain $X_n$. Each of these Markovian interpretations corresponds to a possibly different sequence of Markov transitions $K_{n,\eta}$ from $E_{n-1}$ into $E_n$ and satisfying the compatibility condition

$$\eta K_{n,\eta} = \Phi_n(\eta) \qquad (10.2)$$

for any $\eta \in \mathcal{P}(E_{n-1})$ and $n \ge 1$. We refer the reader to Section 2.5.3 for several examples of compatible transitions in the context of Feynman-Kac semigroups. We associate with a given compatible sequence of transitions $K_{n,\eta}$ a filtered and canonical probability space

$$\big(\Omega_n = E_{[0,n]},\ (\mathcal{F}_p)_{0 \le p \le n},\ (X_p)_{0 \le p \le n},\ \mathbb{K}_{\eta_0,n}\big)$$

with the McKean probability measure $\mathbb{K}_{\eta_0,n} \in \mathcal{P}(E_{[0,n]})$ defined by

$$\mathbb{K}_{\eta_0,n}(d(u_0, \ldots, u_n)) = \eta_0(du_0)\,K_{1,\eta_0}(u_0, du_1) \cdots K_{n,\eta_{n-1}}(u_{n-1}, du_n)$$

¹ i.e., complete and separable metric spaces.







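Under $\mathbb{K}_{\eta_0,n}$, the canonical sequence $(X_p)_{0 \le p \le n}$ forms a Markov chain with transitions $K_{p,\eta_{p-1}}$ and $\eta_p = \mathrm{Law}(X_p)$, $0 \le p \le n$. The $N$-particle model associated with the McKean interpretation above is a Markov chain $\xi_p = (\xi_p^i)_{1 \le i \le N} \in E_p^N$ with initial distribution $\eta_0^{\otimes N}$ and elementary transitions

$$\mathrm{Proba}\big(\xi_p \in d(x_p^1, \ldots, x_p^N) \mid \xi_{p-1}\big) = \prod_{i=1}^{N} K_{p,\eta_{p-1}^N}(\xi_{p-1}^i, dx_p^i)$$

The $N$-particle approximation McKean measures $\mathbb{K}^N_{\eta_0,n} \in \mathcal{P}(E_{[0,n]})$ and the corresponding density profiles $\eta_n^N \in \mathcal{P}(E_n)$ are defined by

$$\mathbb{K}^N_{\eta_0,n} = \frac{1}{N} \sum_{i=1}^{N} \delta_{(\xi_0^i, \ldots, \xi_n^i)} \quad\text{and}\quad \eta_n^N = \frac{1}{N} \sum_{i=1}^{N} \delta_{\xi_n^i}$$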
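To make the particle model concrete, here is a minimal simulation sketch on a finite state space. It assumes a Feynman-Kac flow $\Phi(\eta) = (\eta G) M / \eta(G)$ with a time-homogeneous potential `G` and transition matrix `M`, together with the McKean interpretation $K_{n,\eta}(x,\cdot) = \Phi_n(\eta)$, so that each particle is sampled independently from $\Phi_n(\eta_{n-1}^N)$ (selection followed by mutation). These modeling choices and all names are assumptions made for the illustration only.

```python
import numpy as np

def exact_flow(eta0, G, M, n):
    """Exact measure-valued flow eta_p = Phi(eta_{p-1}) on a finite state space."""
    etas = [np.asarray(eta0, dtype=float)]
    for _ in range(n):
        w = etas[-1] * G
        etas.append(w @ M / w.sum())
    return etas

def particle_profiles(eta0, G, M, n, N, rng):
    """Density profiles eta_p^N of the N-particle model, p = 0, ..., n."""
    d = len(eta0)
    xi = rng.choice(d, size=N, p=eta0)                 # xi_0^i i.i.d. with law eta_0
    profiles = [np.bincount(xi, minlength=d) / N]
    for _ in range(n):
        w = G[xi]
        parents = rng.choice(xi, size=N, p=w / w.sum())   # selection with weights G
        cdf = np.cumsum(M[parents], axis=1)               # mutation by inverse CDF
        u = rng.random(N)
        xi = np.minimum((u[:, None] > cdf).sum(axis=1), d - 1)
        profiles.append(np.bincount(xi, minlength=d) / N)
    return profiles

def max_profile_error(seed=1, d=4, n=5, N=20000):
    """Largest deviation between eta_p^N and eta_p over all times and states."""
    rng = np.random.default_rng(seed)
    M = rng.random((d, d))
    M /= M.sum(axis=1, keepdims=True)
    G = rng.uniform(0.5, 1.5, d)
    eta0 = np.full(d, 1.0 / d)
    exact = exact_flow(eta0, G, M, n)
    approx = particle_profiles(eta0, G, M, n, N, rng)
    return max(np.abs(a - e).max() for a, e in zip(approx, exact))
```

With a few thousand particles the value returned by `max_profile_error()` is small, in line with the laws of large numbers recalled in this section.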



In earlier sections, we developed a series of asymptotic results on these
random distributions as the size $N$ of the systems tends to infinity, including
weak and strong laws of large numbers, $L_p$ mean error analysis, exponential
decays, and central limit theorems. The latter fluctuation analysis provides
sharp asymptotic estimates of the $L_p$ mean errors

$$\sqrt{N}\,[\mathbb{K}^N_{\eta_0,n} - \mathbb{K}_{\eta_0,n}] \quad\text{and}\quad \sqrt{N}\,[\eta_n^N - \eta_n]$$

In Section 7.4.2, we also derived a collection of exponential estimates (see
for instance (7.19) in Theorem 7.4.2 or (7.21) in Theorem 7.4.3). This
concentration analysis deals with events where $\eta_n^N$ differs from $\eta_n$ by an
amount of order 1 (order $N$ for the unnormalized sums $N\,\eta_n^N(f_n)$), well beyond
the fluctuation order $1/\sqrt{N}$ (respectively $\sqrt{N}$) described by the central
limit theorem. One of the challenging questions we address here is to obtain
tight and asymptotically sharp exponential bounds for the asymptotic decay
rate as $N \to \infty$ of



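$$P\left(\left|[\mathbb{K}^N_{\eta_0,n} - \mathbb{K}_{\eta_0,n}](F_n)\right| > \varepsilon\right) \qquad (10.3)$$

or

$$P\left(|[\eta_0^N - \eta_0](f_0)| > \varepsilon_0, \ldots, |[\eta_n^N - \eta_n](f_n)| > \varepsilon_n\right) \qquad (10.4)$$

where $\varepsilon, \varepsilon_p > 0$, $F_n \in \mathcal{B}_b(E_{[0,n]})$, and $f_p \in \mathcal{B}_b(E_p)$, $p \le n$, is an
arbitrary sequence of test functions.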

More generally, we can consider the particle McKean measures $\mathbb{K}^N_{\eta_0,n}$ and
the approximation flow $(\eta_p^N)_{0\le p\le n}$ as sequences of random variables taking
values in $\mathcal{P}(E_{[0,n]})$ and in the Cartesian product $\prod_{p=0}^n \mathcal{P}(E_p)$. In this
interpretation, it is first convenient to equip these distribution spaces with
an appropriate topology so that open and closed subsets are well-defined.
Furthermore, these topologies must also be "compatible" with the previous
analysis in the sense that the deviant events in (10.4) have to be the
complementary subset of some open neighborhoods $V(\eta_n) \subset \mathcal{P}(E_n)$ of $\eta_n$




10.1 Introduction 335 



and we have that $\lim_{N\to\infty} P(\eta_n^N \notin V(\eta_n)) = 0$. In this topological context,
large deviations describe precisely the increasing "deviant" behavior of the
event $\{\eta_n^N \notin V(\eta_n)\}$.

At this stage, the reader may wonder what exactly is meant by a large- 
deviation principle. For later purposes, it is convenient to start here with 
an abstract and general definition. 

Definition 10.1.1 Let $M$ be a topological space equipped with a $\sigma$-field
$\sigma(M)$. We say that a sequence of probability measures $(P^N)_{N\ge 1}$ on the
measurable space $(M, \sigma(M))$ satisfies the large-deviation principle
(abbreviated LDP) with rate function $H$ or, equivalently, that the rate function
$H$ governs the LDP of $P^N$ if the following two assertions are met:

• $H$ is a lower semicontinuous mapping from $M$ into $[0, \infty]$ (the value
$\infty$ is not excluded).

• For any $A \in \sigma(M)$,

$$-H(\mathring{A}) \le \liminf_{N\to\infty} \frac{1}{N}\log P^N(A) \le \limsup_{N\to\infty} \frac{1}{N}\log P^N(A) \le -H(\overline{A}) \qquad (10.5)$$

where $\mathring{A}$ and $\overline{A}$ denote respectively the interior and the closure of $A$
and for any $A \subset M$ we have used the notation $H(A) = \inf_{m\in A} H(m)$.

The function $H$ is said to be a good rate function when its level sets
$H^{-1}([0,a])$, $a \in [0,\infty)$, are compact subsets in $M$.
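As a numerical illustration of (10.5), the sketch below takes $P^N$ to be the law of the empirical mean of $N$ independent Bernoulli($p$) variables and $A = [x, 1]$ with $x > p$. This example, the classical rate function $H(x) = x\log(x/p) + (1-x)\log((1-x)/(1-p))$, and the function names are assumptions made for illustration and are not part of the text. The code evaluates $\frac{1}{N}\log P^N(A)$ exactly and compares it with $-H(A) = -H(x)$.

```python
import math


def rate(x, p):
    """Rate function of the Bernoulli(p) empirical mean (a relative entropy)."""
    def term(a, b):
        return 0.0 if a == 0.0 else a * math.log(a / b)
    return term(x, p) + term(1.0 - x, 1.0 - p)


def log_binom_pmf(k, n, p):
    """log P(S_n = k) for S_n ~ Binomial(n, p), via log-gamma to avoid overflow."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))


def scaled_log_tail(n, x, p):
    """(1/n) log P(S_n / n >= x), computed with a log-sum-exp over the tail."""
    k0 = math.ceil(n * x - 1e-9)          # small slack against rounding of n * x
    logs = [log_binom_pmf(k, n, p) for k in range(k0, n + 1)]
    top = max(logs)
    return (top + math.log(sum(math.exp(v - top) for v in logs))) / n


if __name__ == "__main__":
    p, x = 0.5, 0.7
    for n in (10, 100, 1000, 10000):
        print(n, scaled_log_tail(n, x, p), -rate(x, p))
```

The printed values approach $-H(x)$ from below as $N$ grows, with a correction of order $(\log N)/N$ that the logarithmic scale of the LDP does not see.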

Note that this definition strongly depends on the topology of the state
space as well as on the choice of the $\sigma$-algebra. Before attempting to get
into more details, we begin by presenting one traditional example of topology
currently used in the context of interacting particle models. Let $\mathcal{M}(E)$ be
the space of all finite and signed measures on a Polish space $(E, \mathcal{E})$, equipped
with the weak topology generated by $C_b(E)$. For the convenience of a
noninitiated reader, we recall that each $f \in C_b(E)$ can be identified as a point
$p_f$ of the algebraic dual $\mathcal{M}(E)'$ of $\mathcal{M}(E)$ via the map $p_f : \mu \mapsto \mu(f)$. Since
the collection of these functionals $p_f : \mu \mapsto \mu(f) \in \mathbb{R}$, $f \in C_b(E)$, is a separating
vector space in $\mathcal{M}(E)$, the $C_b(E)$-topology makes $\mathcal{M}(E)$ into a locally convex
topological space and the set $C_b(E) \simeq \mathcal{M}(E)^*$ can be identified with
the topological dual of $\mathcal{M}(E)$ with the duality relation

$$(f,\mu) \in (C_b(E) \times \mathcal{M}(E)) \longrightarrow \int_E f\, d\mu \in \mathbb{R}$$

In addition, the set $\mathcal{P}(E)$ of all probability measures on $(E, \mathcal{E})$, as a convex
subset of the locally convex topological space $\mathcal{M}(E)$, is a closed subset in
$\mathcal{M}(E)$, and it is again a Polish space when endowed with the relative weak
topology. This weak topology (also called the vague topology) is generated
by the sets

$$V_{f,\varepsilon}(\mu) = \{\eta \in \mathcal{P}(E) : |\eta(f) - \mu(f)| < \varepsilon\}$$

where $f \in C_b(E)$, $\mu \in \mathcal{P}(E)$, and $\varepsilon \in (0, \infty)$. As the reader has certainly
noticed, the terminology "weak convergence" is not a correct mathematical
notion. In a pure mathematical sense, since $C_b(E) \simeq \mathcal{M}(E)^*$, the $\mathcal{M}(E)^*$-topology
on $\mathcal{M}(E)$ coincides with the traditional and official weak-* topology.
Since all pure and applied probabilists seem to have adopted this
convention, we shall follow this abusive mathematical terminology.

If we take $E = \Omega_n = E_{[0,n]}$ and $f = F_n \in C_b(E_{[0,n]})$, then the deviant
event presented in (10.3) is equivalently expressed in terms of a basis
neighborhood of the McKean measure; that is,

$$P\left(\left|[\mathbb{K}^N_{\eta_0,n} - \mathbb{K}_{\eta_0,n}](F_n)\right| > \varepsilon\right) = P\left(\mathbb{K}^N_{\eta_0,n} \notin V_{F_n,\varepsilon}(\mathbb{K}_{\eta_0,n})\right)$$

One apparently innocent observation is that the function $F_n$ has to be
continuous on the path space $E_{[0,n]}$ so that the deviant event is a closed
set in this weak topology. This shows that the LDP analysis on $\mathcal{P}(E_{[0,n]})$
furnished with the weak topology fails to describe the desired deviant
behavior of the particle approximation measures on all bounded measurable
functions. Therefore it fails to describe the desired deviant behavior on
indicator functions of measurable sets! We will return to this "topological"
trouble in the further development of this section.

With these preliminaries out of the way, we are now in a position to
describe with some precision one of the main results developed in this
section. Let $\Sigma_n$ be the mapping defined by

$$\Sigma_n : \mu \in \mathcal{P}(E_{[0,n]}) \longrightarrow \Sigma_n(\mu) = (\mu_p)_{0\le p\le n} \in \mathcal{P}^n(E) = \prod_{p=0}^n \mathcal{P}(E_p) \qquad (10.6)$$

where $\mu_p$ stands for the $p$th time marginal of $\mu$ with $0 \le p \le n$. We associate
with a given sequence of distributions $\nu = (\nu_p)_{0\le p\le n} \in \mathcal{P}^n(E)$ the measure
$Q_\nu \in \mathcal{P}(E_{[0,n]})$ defined by

$$Q_\nu(d(u_0,\ldots,u_n)) = \eta_0(du_0)\, K_{1,\nu_0}(u_0,du_1) \cdots K_{n,\nu_{n-1}}(u_{n-1},du_n)$$

It is important to note that the McKean distribution $\mathbb{K}_{\eta_0,n}$ is a
fixed point $M(\mathbb{K}_{\eta_0,n}) = \mathbb{K}_{\eta_0,n}$ of the mapping

$$M : \mu \in \mathcal{P}(E_{[0,n]}) \longrightarrow M(\mu) = Q_{\Sigma_n(\mu)} \in \mathcal{P}(E_{[0,n]})$$

We are now in a position to state the first result of this section. 

Theorem 10.1.1 Suppose that, for each $0 < p \le n$ and $u_{p-1} \in E_{p-1}$, the
measures $(K_{p,\mu}(u_{p-1}, \cdot))_{\mu\in\mathcal{P}(E_{p-1})}$ are mutually absolutely continuous.
Also suppose that for any $a \in \mathbb{R}$ the mappings

$$\mu \in \mathcal{P}(E_{p-1}) \longrightarrow \int_{E_{p-1}} \mu(du_{p-1})\, \log Z^{(a)}_p(\mu,\eta)(u_{p-1}) \qquad (10.7)$$

with

$$Z^{(a)}_p(\mu,\eta)(u_{p-1}) = \int_{E_p} \left[\frac{dK_{p,\mu}(u_{p-1},\cdot)}{dK_{p,\eta}(u_{p-1},\cdot)}(u_p)\right]^{a} K_{p,\eta}(u_{p-1},du_p) \in (0,\infty)$$

are bounded and continuous at each $\mu = \eta$ for the weak topology. Then
the law of the $N$-particle measures $\mathbb{K}^N_{\eta_0,n}$ satisfies the LDP in $\mathcal{P}(E_{[0,n]})$
equipped with the weak topology and with the good rate function $I_n$ given by

$$I_n : \mu \in \mathcal{P}(E_{[0,n]}) \longrightarrow I_n(\mu) = \mathrm{Ent}(\mu \mid M(\mu)) \in [0,\infty]$$

In addition, we have $I_n(\mu) = 0$ if and only if $\mu = \mathbb{K}_{\eta_0,n}$.

In the context of Feynman-Kac models, we provide in Section 10.7 a
set of simple sufficient conditions on the pair $(G_n, M_n)$ under which the
regularity conditions stated in the theorem above hold true.

To relax the regularity conditions needed in Theorem 10.1.1, we will
simplify the analysis and we will work with the flow of the particle density
profiles $(\eta_p^N)_{0\le p\le n}$ associated with the first McKean interpretation
$K_{n,\eta}(u, \cdot) = \Phi_n(\eta)$. In Section 10.2 (see Theorem 10.2.1), we shall see
that the LDP is preserved under continuous mappings. Since $\Sigma_n$ is a
continuous mapping, we readily deduce from Theorem 10.1.1 that

$$\Sigma_n(\mathbb{K}^N_{\eta_0,n}) = (\eta_p^N)_{0\le p\le n}$$

satisfies the LDP in $\mathcal{P}^n(E)$ (equipped with the product weak topology) with
the good rate function defined by

$$J_n((\mu_p)_{0\le p\le n}) = \sum_{p=0}^n \mathrm{Ent}(\mu_p \mid \Phi_p(\mu_{p-1}))$$

with the convention $\Phi_0(\mu_{-1}) = \eta_0$ for $p = 0$. Note that $J_n((\mu_p)_{p\le n}) = 0$
if and only if $(\mu_p)_{p\le n}$ satisfies the nonlinear equation (10.1) starting at
$\mu_0 = \eta_0$. Of course, this result does not imply the LDP for the particle
McKean measures but, as we mentioned earlier, we will push forward the
LDP analysis to another natural and much stronger topology than the
previous one. More precisely, we shall assume that the state spaces $E_n$ are
Hausdorff topological spaces furnished with the Borel $\sigma$-field and the sets
of probability measures $\mathcal{P}(E_n)$ are endowed with the (strong) $\tau$-topology.
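The rate function $J_n$ is easy to evaluate on a finite state space. The sketch below does so under assumptions made only for illustration: the one-step mapping has the Feynman-Kac form $\Phi(\mu) = \Psi_G(\mu)M$, and the potential `G`, the matrix `M`, the initial law `ETA0`, and the function names are hypothetical. It checks that $J_n$ vanishes on the solution of $\mu_p = \Phi(\mu_{p-1})$, $\mu_0 = \eta_0$, and is strictly positive on a perturbed flow.

```python
import math

# Hypothetical finite-state model (assumption, for illustration only).
G = [1.0, 2.0, 0.5]
M = [[0.8, 0.1, 0.1],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]
ETA0 = [0.5, 0.3, 0.2]


def phi(mu):
    """One-step mapping Phi(mu) = Psi_G(mu) M."""
    weights = [m * g for m, g in zip(mu, G)]
    total = sum(weights)
    d = len(mu)
    return [sum(weights[i] / total * M[i][j] for i in range(d)) for j in range(d)]


def relative_entropy(mu, nu):
    """Ent(mu | nu) = sum_x mu(x) log(mu(x)/nu(x)); +inf unless mu << nu."""
    total = 0.0
    for m, v in zip(mu, nu):
        if m > 0.0:
            if v == 0.0:
                return math.inf
            total += m * math.log(m / v)
    return total


def rate_j(flow):
    """J_n((mu_p)) = sum_p Ent(mu_p | Phi(mu_{p-1})), with Phi_0(mu_{-1}) = eta_0."""
    value = relative_entropy(flow[0], ETA0)
    for previous, current in zip(flow, flow[1:]):
        value += relative_entropy(current, phi(previous))
    return value


def solution_flow(n_steps):
    """The flow mu_p = Phi(mu_{p-1}) started at eta_0."""
    flow = [ETA0]
    for _ in range(n_steps):
        flow.append(phi(flow[-1]))
    return flow


if __name__ == "__main__":
    flow = solution_flow(4)
    print(rate_j(flow))                  # zero on the solution of the nonlinear equation
    flow[2] = [1 / 3, 1 / 3, 1 / 3]      # perturb one time marginal
    print(rate_j(flow))                  # strictly positive
```

Perturbing a single marginal $\mu_p$ changes two terms of the sum, the one at time $p$ and the one at time $p+1$, which reflects the one-step structure of the rate function.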

We recall that the $\tau$-topology on the set of probability measures $\mathcal{P}(E)$
on a Hausdorff topological space $(E, \mathcal{E})$ equipped with the Borel $\sigma$-field $\mathcal{E}$
is the topology generated by the sets

$$W_{f,\varepsilon}(\mu) = \{\eta \in \mathcal{P}(E) : |\eta(f) - \mu(f)| < \varepsilon\}$$

where $f \in \mathcal{B}_b(E)$, $\mu \in \mathcal{P}(E)$, and $\varepsilon \in (0,\infty)$. Since the functions $f$
appearing in this definition are measurable and bounded, the $\tau$-topology is finer
(or stronger) than the weak topology. This natural topology is in general
strictly finer than a topology associated with a given Zolotarev type
seminorm (see Section 7.3). For instance, when $E = \mathbb{R}^d$ and
$\mathcal{F} = \{1_{(-\infty,x]} ; x \in \mathbb{R}^d\}$, the topology induced by the supremum distance

$$\|\mu - \eta\|_{\mathcal{F}} = \sup_{x\in\mathbb{R}^d} |\mu((-\infty,x]) - \eta((-\infty,x])|$$

is strictly coarser than the $\tau$-topology (see [170]). Consequently, we note
that the sets $U_\varepsilon(\eta) = \{\mu \in \mathcal{P}(E) : \|\mu - \eta\|_{\mathcal{F}} < \varepsilon\}$, where $\eta \in \mathcal{P}(E)$, $\varepsilon > 0$,
and $\mathcal{F} \subset \mathcal{B}_b(E)$ runs through all countable collections of functions, form a
basis and not merely a subbasis of the $\tau$-topology on $\mathcal{P}(E)$.

From the previous discussion, the large-deviation analysis in the $\tau$-topology
will provide precise information on the deviant behavior of the particle
approximation measures on any countable collection of bounded measurable
functions. For instance, the extended version of the deviant events (10.4)
with respect to Zolotarev type neighborhoods now reads

$$P\left(\|\eta_0^N - \eta_0\|_{\mathcal{F}_0} > \varepsilon_0, \ldots, \|\eta_n^N - \eta_n\|_{\mathcal{F}_n} > \varepsilon_n\right) = P\left((\eta_p^N)_{0\le p\le n} \in \prod_{p=0}^n U_{p,\varepsilon_p}(\eta_p)^c\right)$$

To avoid some unnecessary discussions on measurability questions, we will
suppose that a given set of probability measures $\mathcal{P}(E)$ on a Hausdorff
topological space $(E, \mathcal{E})$ and equipped with the $\tau$-topology is always
endowed with the smallest $\sigma$-field that makes measurable all functionals
$p_f : \mu \in \mathcal{P}(E) \mapsto \mu(f) \in \mathbb{R}$ with $f \in \mathcal{B}_b(E)$.

Theorem 10.1.2 Let $\eta_n^N$ be the particle density profiles associated with a
collection of $\tau$-continuous mappings $\Phi_n : \mathcal{P}(E_{n-1}) \to \mathcal{P}(E_n)$. We also
assume that the mappings $\Phi_n$ satisfy the following regularity condition.

(H) For any $n \ge 1$, there exist some reference measure $\lambda_n \in \mathcal{P}(E_n)$
and some parameter $\rho_n > 0$ such that for any $\mu_{n-1} \in \mathcal{P}(E_{n-1})$ we have

$$\rho_n\, \lambda_n \le \Phi_n(\mu_{n-1}) \ll \lambda_n$$

Then, the law of $(\eta_p^N)_{0\le p\le n}$ satisfies the LDP in $\mathcal{P}^n(E)$ with the product
$\tau$-topology (and hence for the weak topology) with rate function $J_n$.

In the context of Feynman-Kac models, the one-step mappings $\Phi_n$ are
clearly $\tau$-continuous as soon as the potential functions are strictly positive.
Also note that condition (H) is met as soon as for any $x_{n-1} \in E_{n-1}$ we
have

$$\rho_n\, \lambda_n \le M_n(x_{n-1}, \cdot) \ll \lambda_n$$

For instance let us suppose that $E_n = \mathbb{R}$ and

$$M_n(x,dy) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(y - a_n(x))^2}\, dy$$

where $a_n$ is a bounded measurable drift function on $\mathbb{R}$. In this case condition
(H) is met with the reference measure

$$\lambda_n(dy) = \rho_n(dy)/\rho_n(\mathbb{R}) \quad\text{with}\quad \rho_n(dy) =_{\mathrm{def.}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(|y| + \|a_n\|)^2}\, dy$$

and with the parameters $\rho_n = \rho_n(\mathbb{R})$.
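The lower bound in condition (H) can be checked numerically in this Gaussian setting. The sketch below assumes the transition density $(2\pi)^{-1/2} e^{-(y-a(x))^2/2}$, the unnormalized reference density $(2\pi)^{-1/2} e^{-(|y| + \|a\|)^2/2}$, and the hypothetical drift $a(x) = \tanh(x)$ with $\|a\| = 1$; the drift, the grids, and the function names are illustrative assumptions. Since $|y - a(x)| \le |y| + \|a\|$, the ratio of the two densities is at least 1, and the total mass of the reference density is $\mathrm{erfc}(\|a\|/\sqrt{2})$.

```python
import math

A_SUP = 1.0  # sup-norm of the hypothetical drift a(x) = tanh(x)


def drift(x):
    """Bounded drift function (assumption for this sketch)."""
    return math.tanh(x)


def transition_density(x, y):
    """Density of M(x, dy): a unit Gaussian centred at drift(x)."""
    return math.exp(-0.5 * (y - drift(x)) ** 2) / math.sqrt(2.0 * math.pi)


def rho_density(y):
    """Unnormalised reference density, dominated by every M(x, dy)."""
    return math.exp(-0.5 * (abs(y) + A_SUP) ** 2) / math.sqrt(2.0 * math.pi)


def rho_mass():
    """Total mass rho(R) = 2 * P(Z >= A_SUP) = erfc(A_SUP / sqrt(2)) for Z ~ N(0, 1)."""
    return math.erfc(A_SUP / math.sqrt(2.0))


def min_density_ratio(xs, ys):
    """Smallest value of dM(x, .)/d rho over the grid; (H) needs it to be >= 1."""
    return min(transition_density(x, y) / rho_density(y) for x in xs for y in ys)


if __name__ == "__main__":
    xs = [i * 0.25 for i in range(-20, 21)]   # x in [-5, 5]
    ys = [j * 0.25 for j in range(-32, 33)]   # y in [-8, 8]
    print(min_density_ratio(xs, ys), rho_mass())
```

With $\lambda = \rho/\rho(\mathbb{R})$, the bound $M(x,\cdot) \ge \rho$ reads $M(x,\cdot) \ge \rho(\mathbb{R})\,\lambda$, so the constant in (H) is the total mass returned by `rho_mass`.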

On page 351, we will check that the laws of the particle density profiles
$(\eta_p^N)_{0\le p\le n}$ are exponentially tight for the product weak topology on $\mathcal{P}^n(E)$
as soon as the spaces $E_n$ are Polish. As we will note on page 350, the LDP lower
bounds for the weak topology combined with the exponential tightness also imply
that $J_n$ is a good rate function for the weak topology.



10.2 Some Preliminary Results 

10.2.1 Topological Properties 

Let $M$ be a Hausdorff and regular topological space. We recall that a
topology on $M$ is a family of open subsets; that is, a family of subsets that is
stable under finite intersections and arbitrary unions. Since the union (resp.
intersection) of an empty family of sets in $M$ is $\emptyset$ (resp. $M$), the sets $\emptyset$ and $M$
are open. A subset $A$ is said to be closed if $M - A$ is open. This implies
that the sets $\emptyset$ and $M$ are also closed. The Hausdorff property ensures that
single points are closed and every two distinct points have disjoint
neighborhoods. The regularity property refers to the fact that any closed set and
any point outside this set have disjoint neighborhoods. In this situation,
for any neighborhood $A$ of $u \in M$, we can find a subneighborhood $B$ of $u$
such that $\overline{B} \subset A$.

For a detailed account on different classes of topological spaces, we
refer the reader to the comprehensive introductory book on topology by
Dugundji [126] (see for instance p. 311 for a detailed picture on different
topological spaces).

In general, a $\sigma$-field $\sigma(M)$ on $M$ may be too small, and the open and closed
sets $\mathring{A}$ and $\overline{A}$ are not necessarily measurable. Nevertheless, there always exists
a unique smallest $\sigma$-field $\mathcal{B}(M)$ containing the topology of $M$. This $\sigma$-field
is generated by the set of all open (or all closed) subsets of $M$. $\mathcal{B}(M)$ is
traditionally called the Borel $\sigma$-field. When $\mathcal{B}(M) \subset \sigma(M)$, it is readily
checked that the bounds (10.5) are equivalent to the following:

1. (Upper bound) For any closed subset $A \subset M$

$$\limsup_{N\to\infty} \frac{1}{N}\log P^N(A) \le -H(A) \qquad (10.8)$$

2. (Lower bound) For any open subset $A \subset M$

$$\liminf_{N\to\infty} \frac{1}{N}\log P^N(A) \ge -H(A) \qquad (10.9)$$

Definition 10.2.1 When a sequence of distributions $P^N$ satisfies the
LDP upper bounds (10.8) only for compact sets $A$ (together with the lower
bounds (10.9)), we say that $P^N$ satisfies a weak LDP.

Definition 10.2.2 A function $V : M \to \mathbb{R} \cup \{-\infty\}$ is lower
semicontinuous (which we abbreviate l.s.c.) if, for each $a \in \mathbb{R} \cup \{-\infty\}$ and $u \in M$
such that $V(u) > a$, there corresponds a neighborhood $A$ of $u$ such that
$V(v) > a$ for any $v \in A$. The upper semicontinuity (which we abbreviate
u.s.c.) is defined in the same way by reversing the sense of the inequality
in both cases. When the level sets of a nonnegative l.s.c. function $H$ are
compact, the function $H$ is called a good rate function.

It is evident that $V$ is l.s.c. iff its level sets $V^{-1}([-\infty,a])$, $a \in \mathbb{R}$, are
closed. We also clearly have that $V$ is l.s.c. iff $(-V)$ is u.s.c. One important
property is that l.s.c. functions always achieve their infimum over compact
sets. Note that for good rate functions this property is also met over all
closed sets. Many examples of l.s.c. functions can be provided using the simple
fact that the pointwise supremum of a family of l.s.c. (in particular, continuous)
maps is an l.s.c. function. In particular, if $V_n \uparrow V$ with each $V_n$ l.s.c., then
$V$ is l.s.c.

10.2.2 Idempotent Analysis 

Nonnegative l.s.c. functions arise in a variety of research areas, including
convex analysis, game theory, and optimal control theory. They are often
used to describe the cost or the performance attached to some decision
process with respect to a given optimal policy. The theory of idempotent
measures provides a natural and rather general functional model to
describe and analyze most of these optimization problems. In this context,
l.s.c. functions arise as the limiting idempotent measures associated with a
sequence of distributions in a logarithmic scale. The aim of this short
section is to provide an introduction on the relations between large-deviation
analysis and idempotent measures theory (see [205, 240]). These ideas
induce a new, modern way of thinking about large-deviation principles. We
first need quite a bit of preliminary notation.

We define the $N$-logarithmic addition/multiplication operations $(\oplus_N, \odot)$
on $\mathbb{R} \cup \{-\infty\}$ by

$$a \oplus_N b = \frac{1}{N}\log\left(e^{Na} + e^{Nb}\right) \quad\text{and}\quad a \odot b = a + b = \frac{1}{N}\log\left(e^{Na} \cdot e^{Nb}\right)$$





It is easy to see from these definitions that $-\infty$ and $0$ are the neutral
elements of the operations $\oplus_N$ and $\odot$ and the set $(\mathbb{R} \cup \{-\infty\}, \oplus_N, \odot)$ is a
semiring. It is important to note that for any sequences of numbers $a_N$ and
$b_N$ we have

$$\limsup_{N\to\infty}\,(a_N \oplus_N b_N) = \left(\limsup_{N\to\infty} a_N\right) \oplus \left(\limsup_{N\to\infty} b_N\right) \quad\text{with}\quad a \oplus b = a \vee b$$

To prove this assertion, we use the fact that

$$0 \le (a \oplus_N b - a \oplus b) \le 1/N \qquad (10.10)$$

and $\limsup_{N\to\infty}(a_N \oplus b_N) = (\limsup_{N\to\infty} a_N) \oplus (\limsup_{N\to\infty} b_N)$. Note
that the display above is in general not met if we replace $\limsup_{N\to\infty}$ by
$\liminf_{N\to\infty}$ but we have

$$\left(\liminf_{N\to\infty} a_N\right) \oplus \left(\liminf_{N\to\infty} b_N\right) \le \liminf_{N\to\infty}\,(a_N \oplus b_N) = \liminf_{N\to\infty}\,(a_N \oplus_N b_N)$$

For instance for $a_N = (-1)^N = -b_N$, we have

$$a_N \oplus b_N = 1 > \left(\liminf_{N\to\infty} a_N\right) \oplus \left(\liminf_{N\to\infty} b_N\right) = -1$$

This dequantization of the semiring $(\mathbb{R} \cup \{-\infty\}, \oplus_N, \odot)$ into the so-called
(max, +) semiring $(\mathbb{R} \cup \{-\infty\}, \oplus, \odot)$ is the cornerstone of idempotent
measure and functional analysis.
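The dequantization estimate (10.10) can be observed directly in floating-point arithmetic. The following sketch implements $\oplus_N$ in a numerically stable way and compares it with the (max, +) operation; the function names are assumptions made for this illustration. The gap $a \oplus_N b - a \vee b$ equals $\frac{1}{N}\log(1 + e^{-N|a-b|})$, so it lies between 0 and $(\log 2)/N \le 1/N$.

```python
import math

NEG_INF = -math.inf


def oplus_n(a, b, n):
    """N-logarithmic addition (1/N) log(exp(N a) + exp(N b)), computed stably."""
    hi, lo = max(a, b), min(a, b)
    if hi == NEG_INF:                    # both arguments are the neutral element
        return NEG_INF
    return hi + math.log1p(math.exp(n * (lo - hi))) / n


def otimes(a, b):
    """Logarithmic multiplication: ordinary addition."""
    return a + b


def oplus(a, b):
    """Idempotent (max, +) addition, the N -> infinity limit of oplus_n."""
    return max(a, b)


if __name__ == "__main__":
    a, b = 0.3, 0.1
    for n in (1, 10, 100, 1000):
        gap = oplus_n(a, b, n) - oplus(a, b)
        print(n, gap, math.log(2.0) / n)
```

The neutral elements behave as stated in the text: `oplus_n(a, NEG_INF, n)` returns `a`, and `otimes(a, 0.0)` returns `a`.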

Let $\mathcal{B}_b(M, \mathbb{R} \cup \{-\infty\})$ be the set of all upper-bounded measurable
functions from $M$ into $\mathbb{R} \cup \{-\infty\}$. Idempotent analysis is concerned with the
study of linear operators on $\mathcal{B}_b(M, \mathbb{R} \cup \{-\infty\})$ taking values in the
semirings associated with the previously defined operations.

Definition 10.2.3 A $(\oplus_N, \odot)$-integral operator is a linear mapping $L_N$ from
the set $\mathcal{B}_b(M, \mathbb{R} \cup \{-\infty\})$ into the semiring $(\mathbb{R} \cup \{-\infty\}, \oplus_N, \odot)$. In reference
to the traditional theory of integration, sometimes we use the notation, for
any $f \in \mathcal{B}_b(M, \mathbb{R} \cup \{-\infty\})$,

$$L_N(f) = \int^{\oplus_N} f \odot dL_N$$

A $(\oplus, \odot)$-integral operator $L$ is defined in the same way by replacing the
logarithmic operation $\oplus_N$ by $\oplus$.

To emphasize the role of this functional framework in large-deviation anal- 
ysis, we introduce the following definition. 

Definition 10.2.4 Let $P^N$ be a sequence of probability measures on a
topological space $M$ equipped with a $\sigma$-field $\sigma(M) \supset \mathcal{B}(M)$. We associate with
$P^N$ the set function

$$L_N : A \in \sigma(M) \longrightarrow L_N[A] = \frac{1}{N}\log P^N(A) \in [-\infty, 0]$$

with the convention $\log 0 = -\infty$. More generally, we define the
$N$-logarithmic integral of a measurable function $f : M \to \mathbb{R} \cup \{-\infty\}$ by setting

$$L_N(f) = \frac{1}{N}\log \int e^{Nf}\, dP^N$$

Using a simple manipulation, we check that for any $A, B \in \sigma(M)$ with
$A \cap B = \emptyset$

$$L_N[A \cup B] = L_N[A] \oplus_N L_N[B]$$

Using the same arguments, we check that the resulting integral operator
$L_N$ is $(\oplus_N, \odot)$-linear in the sense that for any pair of upper-bounded
measurable functions $f, g : M \to \mathbb{R} \cup \{-\infty\}$ and for any $a, b \in \mathbb{R} \cup \{-\infty\}$

$$L_N[a \odot f \oplus_N b \odot g] = a \odot L_N[f] \oplus_N b \odot L_N[g] \qquad (10.11)$$

Furthermore, using (10.10), we easily prove that

$$0 \le L_N[f \oplus_N g] - L_N[f \oplus g] \le 1/N \qquad (10.12)$$

We also introduce the limiting set functions $L^*$ and $L_*$ defined for any
$A \in \sigma(M)$ by

$$L^*[A] = \limsup_{N\to\infty} L_N[A] \quad\text{and}\quad L_*[A] = \liminf_{N\to\infty} L_N[A]$$

From previous considerations, we readily check that $L^*$ is an idempotent
probability measure on $(M, \sigma(M))$ in the sense that

$$L^*(\emptyset) = -\infty, \quad L^*(M) = 0, \quad\text{and}\quad L^*(A \cup B) = L^*(A) \oplus L^*(B)$$

for any $A, B \in \sigma(M)$ with $A \cap B = \emptyset$. In terms of integral operators, we
also deduce from (10.12) that

$$L^*(a \odot f \oplus b \odot g) = a \odot L^*(f) \oplus b \odot L^*(g)$$

for any pair of upper-bounded measurable functions $f, g : M \to \mathbb{R} \cup \{-\infty\}$
and for any pair $a, b \in \mathbb{R} \cup \{-\infty\}$. Using standard calculations, we observe
that the formula above is met for any pair of sets $(A, B)$ and we have the
idempotent property $L^*(A) = L^*(A) \oplus L^*(A)$. We summarize the discussion
above with the following proposition.

Proposition 10.2.1 The $N$-logarithmic integral operators $L_N$ associated
with a sequence of probability measures $P^N$ on a topological space $M$ (equipped
with a $\sigma$-field $\sigma(M) \supset \mathcal{B}(M)$) are $(\oplus_N, \odot)$-integral operators. In addition,
the limiting operator $L^* = \limsup_{N\to\infty} L_N$ is a $(\oplus, \odot)$-integral operator.

To illustrate these constructions and to keep the ideas as simple as possible,
let us examine the case where the state space is finite, $M = \{1, \ldots, d\}$, and
equipped with the discrete Borel $\sigma$-field $\mathcal{B}(M) = \sigma(M)$. Also let $P^N$ be
the sequence of distributions on $M$ defined by

$$P^N(du) = \sum_{i=1}^d e^{-N h^N(i)}\, \delta_i(du)$$

for some $h^N(i) \ge 0$ such that $P^N(M) = 1$. In this simple situation, we
readily check that

$$L_N[\{i\}] = -h^N(i)$$

From previous estimations, we note that for every $A \in \mathcal{B}(M)$ we have

$$-\inf_{i\in A} h^*(i) \le L_*[A] \le L^*[A] = -\inf_{i\in A} h_*(i)$$

with $\liminf_{N\to\infty} h^N(i) = h_*(i)$ and $\limsup_{N\to\infty} h^N(i) = h^*(i)$. In this
situation, we see that $h_* = h^*$ if and only if $P^N$ satisfies the LDP with rate
function $h^*$. More generally, we find that a sequence of distributions $P^N$
on $M$ satisfies the LDP with rate function $H : M \to [0, \infty]$ if and only if
for any $A \in \sigma(M)$ we have

$$L[\mathring{A}] \le L_*[A] \le L^*[A] \le L[\overline{A}] \qquad (10.13)$$

with the idempotent measure $L$ on $\sigma(M)$ defined by $L[A] = -H(A) =
-\inf_A H$. The $(\oplus, \odot)$-integral corresponding to an idempotent measure $L$
of the previous form is simply given for any $f \in \mathcal{B}_b(M, \mathbb{R} \cup \{-\infty\})$ by

$$\int^{\oplus} f \odot dL = \sup_{u\in M}\,(f(u) + L(u)) = \sup_{u\in M}\,(f(u) - H(u))$$
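On a finite state space, the convergence of the $N$-logarithmic integrals towards the idempotent integral can be checked by direct computation. The sketch below assumes, for illustration only, a fixed function `h` with minimum 0 and the distributions $P^N(i) \propto e^{-N h(i)}$ (normalized explicitly rather than through an $N$-dependent $h^N$); the function names are hypothetical. It compares $L_N(f) = \frac{1}{N}\log\sum_i e^{N f(i)} P^N(i)$ with $\sup_i (f(i) - h(i))$.

```python
import math


def log_sum_exp(values):
    """Stable log(sum(exp(v))) that tolerates entries equal to -inf."""
    top = max(values)
    if top == -math.inf:
        return -math.inf
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_integral(f, h, n):
    """L_N(f) for P^N(i) proportional to exp(-N h(i)) on the finite set {0,...,d-1}."""
    log_z = log_sum_exp([-n * hi for hi in h])          # log normalising constant
    return (log_sum_exp([n * (fi - hi) for fi, hi in zip(f, h)]) - log_z) / n


def idempotent_integral(f, rate):
    """(max, +) integral sup_u (f(u) - H(u)) against L = -H."""
    return max(fi - ri for fi, ri in zip(f, rate))


def indicator(subset, d):
    """(max, +) indicator: 0 on the subset and -inf elsewhere."""
    return [0.0 if i in subset else -math.inf for i in range(d)]


if __name__ == "__main__":
    h = [0.3, 0.0, 1.2, 0.7]             # hypothetical rate function, min h = 0
    f = [1.0, 0.2, 2.5, -math.inf]
    for n in (1, 10, 100, 1000):
        print(n, log_integral(f, h, n), idempotent_integral(f, h))
    # With f = Ind_A the integral reduces to L_N[A] -> -inf_A h.
    ind = indicator({0, 2}, len(h))
    print(log_integral(ind, h, 1000), idempotent_integral(ind, h))
```

In this setting the error $|L_N(f) - \sup_i (f(i) - h(i))|$ is at most $(\log d)/N$, which is the finite-state analogue of the estimate (10.12).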

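To get one step further we recall that for any $f \in \mathcal{B}_b(M, \mathbb{R} \cup \{-\infty\})$ we
have $\mathring{f} \le f \le \overline{f}$ with the pair l.s.c./u.s.c. (upper semicontinuous) closures
$(\mathring{f}, \overline{f})$ of $f$ defined by

$$\mathring{f}(u) = \sup\,\{\inf_A f : u \in A,\ A \text{ open}\}$$

$$\overline{f}(u) = \inf\,\{\sup_A f : A \text{ a closed neighborhood of } u\}$$

These functions are sometimes called the inferior and superior limits of $f$.
To illustrate these new notions, let $\mathrm{Ind}_A$ be the $(\oplus, \odot)$ indicator function of
a set $A \in \sigma(M)$, defined by $\mathrm{Ind}_A(u) = 0$ for $u \in A$ and $-\infty$ otherwise.
By the closure definition, it is easily seen that

$$\mathring{\mathrm{Ind}_A} = \mathrm{Ind}_{\mathring{A}} \quad\text{and}\quad \overline{\mathrm{Ind}_A} = \mathrm{Ind}_{\overline{A}}$$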







Consequently, (10.13) is equivalent to saying

$$L[\mathring{f}] \le L_*[f] \le L^*[f] \le L[\overline{f}] \qquad (10.14)$$

for any indicator functions $f \in \{\mathrm{Ind}_A : A \in \sigma(M)\}$. In this connection, we
already mention that Varadhan's integral lemma (Lemma 10.4.1) can be
reformulated in terms of idempotent measures by saying that (10.14) also
holds true for any upper-bounded measurable function $f : M \to \mathbb{R} \cup \{-\infty\}$.
As the initiated reader may have noticed, most of the large-deviation results
can be reformulated into an equivalent statement in idempotent analysis.
It is clearly out of the scope of this book to present a catalog of all of
these connections. Nevertheless, we finally simply state that Fenchel
transformations of l.s.c. functions correspond to characteristic functions of an
idempotent probability measure.

10.2.3 Some Regularity Properties 

In the further development of this section, we use the following natural
idempotent measure notation. We associate with an l.s.c. function
$H : M \to [0, \infty]$ the set function still denoted by $H$ and defined for any $A \subset M$
by

$$H(A) = \inf_{m\in A} H(m)$$

with the convention $H(\emptyset) = \infty$. Note that for any $A, B \subset M$ and $a \in \mathbb{R}$ we
have

$$H(A \cup B) = H(A) \wedge H(B) \quad\text{and}\quad (a + H)(A) = a + H(A)$$

as well as $H(A \cap B) \ge H(A) \vee H(B)$. We easily check the idempotent
property of these set functions; namely, for any $A \subset M$ we have $H(A) =
H(A) \wedge H(A)$.

Definition 10.2.5 Let $\pi : M_1 \to M_2$ be a continuous function between a
pair $(M_1, M_2)$ of Hausdorff topological spaces. Also let $H : M_1 \to [0, \infty]$ be
an l.s.c. function. We will denote by $H \circ \pi^{-1} : M_2 \to [0, \infty]$ the $\pi$-image
of $H$ defined for any $u_2 \in M_2$ by

$$H \circ \pi^{-1}(u_2) = \inf\,\{H(u_1) : u_1 \in M_1 \text{ s.t. } \pi(u_1) = u_2\} \in [0, \infty] \qquad (10.15)$$

This function corresponds to the $\pi$-image of an idempotent measure.
Indeed, for any $A \subset M_2$, we have

$$(H \circ \pi^{-1})(A) = \inf_{m_2\in A}\,(H \circ \pi^{-1})(m_2) = \inf_{m_1\in\pi^{-1}(A)} H(m_1) = H(\pi^{-1}(A)) \qquad (10.16)$$

The first good news is that good rate functions are preserved under
continuous transformations. This elementary result gives probably one of the
simplest and most powerful ways to transfer an LDP.







Theorem 10.2.1 Let $\pi : M_1 \to M_2$ be a continuous function between a
pair $(M_1, M_2)$ of Hausdorff topological spaces. The $\pi$-image $H \circ \pi^{-1} : M_2 \to
[0, \infty]$ of a good rate function $H : M_1 \to [0, \infty]$ is a good rate function. In
particular, if $H$ governs the LDP associated with a collection of measures
$P^N$, then the $\pi$-image measures $P^N \circ \pi^{-1}$ satisfy the LDP with good rate
function $H \circ \pi^{-1}$.

Proof:

Since $H$ is a good rate function, the infimum in (10.15) is attained at some
point. Using the fact that $\pi$ is continuous, we conclude that the level sets
$(H \circ \pi^{-1})^{-1}([0,a]) = \pi(H^{-1}([0,a])) \subset M_2$ are compact. On the other hand, for
any open (resp. closed) set $A \subset M_2$, the set $\pi^{-1}(A) \subset M_1$ is open (resp.
closed). From (10.16), we easily check that $P^N \circ \pi^{-1}$ satisfies the LDP with
rate $H \circ \pi^{-1}$. ■
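The $\pi$-image of a rate function is straightforward to compute on a discretized space. The sketch below uses an example that is an assumption of this illustration and not taken from the text: the rate function $H(x, y) = (x^2 + y^2)/2$ on a grid of step 0.1 and the continuous map $\pi(x, y) = x + y$, for which the image rate is $s^2/4$. Grid points are stored as integers (in units of 0.1) so that the fibers $\{\pi = s\}$ can be matched exactly; all names are hypothetical.

```python
import math


def image_rate(rate, pi, points):
    """Return the map u2 -> inf{rate(u1) : pi(u1) = u2} over the image of `points`."""
    out = {}
    for u1 in points:
        u2 = pi(u1)
        out[u2] = min(out.get(u2, math.inf), rate(u1))
    return out


def set_rate(rate_values, subset):
    """H(A) = inf over A of the pointwise rate, with the convention H(empty) = +inf."""
    return min((rate_values[u] for u in subset), default=math.inf)


def gaussian_rate(point):
    """H(x, y) = (x^2 + y^2) / 2 for a grid point given in units of 0.1."""
    i, j = point
    return (i * i + j * j) / 200.0


def total(point):
    """pi(x, y) = x + y, again in units of 0.1."""
    return point[0] + point[1]


if __name__ == "__main__":
    grid = [(i, j) for i in range(-20, 21) for j in range(-20, 21)]
    image = image_rate(gaussian_rate, total, grid)
    for s in (0, 4, 10):
        print(s / 10.0, image[s], (s / 10.0) ** 2 / 4.0)
```

The identity (10.16) can be observed on the same grid: minimizing the image rate over a set $A$ of values of $s$ gives the same number as minimizing $H$ over the preimage $\pi^{-1}(A)$.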

We observe that a lower semicontinuous function $V$ satisfies at every point
$u \in M$

$$V(u) = \sup\,\{V(A) : u \in A,\ A \text{ open}\} \qquad (10.17)$$

Also notice that the supremum in the display above can also be taken over
any open neighborhood topological basis. In this case, by the topological
regularity of $M$, for any $u \in M$ with $V(u) < \infty$ and for any $\varepsilon > 0$ there
exists a pair of neighborhoods $A, B$ of $u$ such that $u \in \overline{B} \subset A$ and

$$V(u) \ge V(\overline{B}) \ge V(A) \ge V(u) - \varepsilon$$

We quote the first elementary but reassuring result of the theory of large
deviations.

Proposition 10.2.2 For any sequence $P^N$ of probability measures on a
Hausdorff and regular topological space, there is at most one rate function
governing the LDP of $P^N$.

Proof:

Suppose there were two such rates $H_1$ and $H_2$ and some point $u \in M$ at
which $H_1(u) > H_2(u)$. From previous comments and because of the lower
semicontinuity, for any $\varepsilon > 0$ there exists a neighborhood $B$ of $u$ such that

$$H_1(u) \ge H_1(\overline{B}) \ge H_1(u) - \varepsilon$$

Since the functions $H_1$, $H_2$ govern the LDP, we also have

$$-H_1(\overline{B}) \ge -H_2(\mathring{B})\ (\ge -H_2(u))$$

This would imply that $H_1(u) \le H_2(u) + \varepsilon$ for any $\varepsilon > 0$, yielding a
contradiction with our assumption $H_1(u) > H_2(u)$. ■

We end this section with two interesting properties of rate functions and
sequences of measures on a metric state space. The first one provides some
nice topological regularity properties of good rate functions with respect
to open and closed neighborhoods.

Proposition 10.2.3 Let $H$ be a good rate function on some metric space
$(M, d)$. For any $u \in M$, we have

$$\lim_{\varepsilon\downarrow 0} H(\overline{B}(u,\varepsilon)) = H(u) = \lim_{\varepsilon\downarrow 0} H(B(u,\varepsilon))$$

where $\overline{B}(u,\varepsilon)$ denotes the closure of the open ball $B(u,\varepsilon) = \{v \in M :
d(u,v) < \varepsilon\}$.

Proof:

Because of the l.s.c., we have $H(u) = \lim_{\varepsilon\downarrow 0} H(B(u,\varepsilon))$. To check the first
equality, it clearly suffices to prove that for any $\varepsilon' > 0$

$$a_{\varepsilon'} =_{\mathrm{def}} \varepsilon' + \lim_{\varepsilon\downarrow 0} H(\overline{B}(u,\varepsilon)) \ge H(u) \qquad (10.18)$$

When the l.h.s. term is infinite, the result is trivial. Otherwise, because
the level sets of $H$ are compact, $C_{\varepsilon'}(u,\varepsilon) = \overline{B}(u,\varepsilon) \cap H^{-1}([0,a_{\varepsilon'}])$ forms a
decreasing sequence of nonempty compact sets (as $\varepsilon \downarrow 0$) and

$$\bigcap_{\varepsilon>0} C_{\varepsilon'}(u,\varepsilon) = C_{\varepsilon'}(u,0) = \{u\} \subset H^{-1}([0,a_{\varepsilon'}])$$

This ends the proof of the proposition. ■

In the next proposition, we already present a universal way to obtain a
weak LDP and identify the rate function.

Proposition 10.2.4 Let $P^N$ be a sequence of distributions on some metric
space $(M, d)$. Suppose that for any $u \in M$ we have

$$\lim_{\varepsilon\to 0}\liminf_{N\to\infty} \frac{1}{N}\log P^N(B(u,\varepsilon)) = -H(u) = \lim_{\varepsilon\to 0}\limsup_{N\to\infty} \frac{1}{N}\log P^N(\overline{B}(u,\varepsilon)) \qquad (10.19)$$

for some function $H : M \to [0, \infty]$. Then $H$ is l.s.c. and it governs the weak
LDP of the sequence $P^N$. Inversely, if a good rate function $H$ governs the
LDP of some sequence of distributions $P^N$, then (10.19) holds true.

Proof: 

For any open set $A \subset M$ and $u \in A$, there exists some $\varepsilon_0 > 0$ such that
$B(u,\varepsilon) \subset A$ for any $0 < \varepsilon \le \varepsilon_0$. Under our assumptions, we find that

$$\liminf_{N\to\infty} \frac{1}{N}\log P^N(B(u,\varepsilon)) \le \liminf_{N\to\infty} \frac{1}{N}\log P^N(A)$$

Letting $\varepsilon \to 0$ and taking the supremum of $(-H)(u)$ over $A$, we conclude
that $H$ governs the LDP lower bound. Let $u_n$ be an approximation sequence
of $u$ such that $d(u_n,u) < 1/n$. Under our assumptions, for any $\delta > 0$ there
exists some $n_\delta$ such that for any $n \ge n_\delta$

$$\liminf_{N\to\infty} \frac{1}{N}\log P^N(B(u,1/n)) \le -H(u) + \delta$$

On the other hand, for each $n$ there exists some sufficiently large $m_n$ such
that $B(u_n,1/m) \subset B(u,1/n)$ for any $m \ge m_n$, and hence

$$\liminf_{N\to\infty} \frac{1}{N}\log P^N(B(u_n,1/m)) \le -H(u) + \delta$$

Letting $m \to \infty$, we conclude that for any $n \ge n_\delta$ we find that

$$H(u_n) \ge H(u) - \delta$$

We conclude that $\liminf_{n\to\infty} H(u_n) \ge \sup_{\delta>0}\,(H(u) - \delta) = H(u)$ and
consequently $H$ is an l.s.c. function. For any open $\varepsilon$-covering of a compact
set $A \subset \cup_{u\in A} B(u,\varepsilon)$, we extract a finite covering

$$A \subset \cup_{1\le i\le d} B(u_i,\varepsilon) \subset \cup_{1\le i\le d} \overline{B}(u_i,\varepsilon)$$

and by the union bound we find that

$$\limsup_{N\to\infty} \frac{1}{N}\log P^N(A) \le -\inf_{1\le i\le d}\left(-\limsup_{N\to\infty} \frac{1}{N}\log P^N(\overline{B}(u_i,\varepsilon))\right)$$

Letting $\varepsilon \downarrow 0$, we conclude that

$$\limsup_{N\to\infty} \frac{1}{N}\log P^N(A) \le -\inf_{1\le i\le d} H(u_i) \le -H(A)$$

By Proposition 10.2.3, the last assertion is a direct consequence of the
definition of the LDP. This ends the proof of the proposition. ■



10.3 Cramér's Method

As the reader may have noticed, proving LDP lower bounds is a very
challenging question. Various rough exponential upper bounds have already
been analyzed in Section 7.4 in the context of particle approximation
models of Feynman-Kac distributions on some measurable spaces $(E_n, \mathcal{E}_n)$. For
instance, for strictly positive potential functions, from the exponential
estimate (7.28) stated in Corollary 7.4.3, we find that

$$\limsup_{N\to\infty} \frac{1}{N}\log P^N_{\eta_0}\left(|\eta_n^N(f_n) - \eta_n(f_n)| > \varepsilon\right) \le -\frac{\varepsilon^2}{2\,b(n)^2}$$

for any measurable functions $f_n : E_n \to \mathbb{R}$ such that $\mathrm{osc}(f_n) \le 1$, and with

$$b(n) \le 2 \sum_{q=0}^n r_{q,n}\, \beta(P_{q,n})$$

To get one step further in our discussion, we recall that these crude
estimates were essentially obtained applying the Chernov-Hoeffding
exponential inequality (see Lemma 7.3.2) to the random variables $\eta_n^N(f_n)$ for
each test function $f_n$. One natural strategy to improve these estimates
and obtain more precise exponential upper bounds that are valid for any
test function is to analyze a "uniform version" of the Chernov-Hoeffding
inequality. This idea goes back to Cramér [62] and is not restricted to
particle approximation models. Because of its importance in practice, we have
chosen to present this strategy in an abstract and general framework.

The strategy we are about to present requires us to strengthen our
topological conditions and to impose some additional algebraic structure on
the set $M$. In the further development of this section, we suppose that $M$ is
a Hausdorff topological vector space (recall that these spaces are necessarily
regular). In analogy with the exponential moments used in the proof of the
Chernov-Hoeffding inequality, we introduce the following.

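Definition 10.3.1 The asymptotic logarithmic moment-generating
function of a sequence of distributions $P^N$ on $M$ (with topological dual $M^*$) is
the function $\Lambda$ defined as follows

$$\Lambda : V \in M^* \longrightarrow \Lambda(V) = \lim_{N\to\infty} \frac{1}{N}\log \int e^{N\langle V, m\rangle}\, P^N(dm) \in \mathbb{R} \cup \{\infty\}$$

Proposition 10.3.1 The function $\Lambda$ and its Legendre-Fenchel
transformation $\Lambda^*$ defined by

$$\Lambda^* : m \in M \longrightarrow \Lambda^*(m) = \sup_{V\in M^*}\,(\langle V, m\rangle - \Lambda(V)) \in [0, \infty]$$

are convex functions. In addition, $\Lambda^*$ is l.s.c., and for any compact set
$A \subset M$ we have

$$\limsup_{N\to\infty} \frac{1}{N}\log P^N(A) \le -\Lambda^*(A) \qquad (10.20)$$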

Proof: 

The convexity of $\Lambda$ is easily checked using the Hölder inequality, while
the convexity of $\Lambda^*$ is a simple consequence of its definition. The l.s.c. of
$\Lambda^*$ is proved by recalling that the supremum of a collection of continuous
functions is l.s.c. Suppose that $\Lambda^*(m) > \delta > 0$ for any $m \in A$ and some
$\delta > 0$; otherwise, we have $\Lambda^*(A) = 0$ and the result is trivial. Using similar
arguments and by definition of $\Lambda^*$, we can associate with each $m \in A$ a
point $V_m \in M^*$ such that

$$\langle V_m, m\rangle - \Lambda(V_m) \ge \Lambda^*(m) - \delta$$

Since $V_m \in M^*$, we also have $\sup_{m'\in A_m} \langle V_m, (m - m')\rangle \le \delta$ for some
neighborhood $A_m \subset M$ of each point $m$. By the exponential version of
Markov's inequality, we find that

$$P^N(A_m) = \int_{A_m} P^N(dm') \le e^{N\delta} \int_M e^{N\langle V_m, m' - m\rangle}\, P^N(dm') = e^{N(\delta - \langle V_m, m\rangle)} \int_M e^{N\langle V_m, m'\rangle}\, P^N(dm')$$

from which we deduce that

$$\frac{1}{N}\log P^N(A_m) \le \delta - \langle V_m, m\rangle + \frac{1}{N}\log \int_M e^{N\langle V_m, m'\rangle}\, P^N(dm')$$

The set $A \subset \cup_{m\in A} A_m$ being compact, we can extract a finite cover $A \subset
\cup_{1\le i\le d} A_{m_i}$ and under our assumptions the estimate above implies that

$$\limsup_{N\to\infty} \frac{1}{N}\log P^N(A) \le \delta - \inf_{1\le i\le d}\,(\langle V_{m_i}, m_i\rangle - \Lambda(V_{m_i})) \le 2\delta - \inf_{1\le i\le d} \Lambda^*(m_i) \le 2\delta - \Lambda^*(A)$$

Taking $\delta \to 0$, the proof of the proposition is completed. ■
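The Legendre-Fenchel transform in (10.20) can be computed numerically when $M = \mathbb{R}$. The sketch below works under assumptions made for illustration only: $P^N$ is the law of the empirical mean of $N$ independent Bernoulli($p$) variables, for which $\Lambda(V) = \log(1 - p + p\,e^V)$ does not depend on $N$, and the supremum over $V$ is located by a ternary search on the concave map $V \mapsto Vm - \Lambda(V)$. The function names, the search interval, and the iteration count are hypothetical choices.

```python
import math


def log_mgf(v, p):
    """Lambda(V) = log(1 - p + p exp(V)), written so that exp never overflows."""
    if v > 0.0:
        return v + math.log(p + (1.0 - p) * math.exp(-v))
    return math.log(1.0 - p + p * math.exp(v))


def legendre(m, p, lo=-50.0, hi=50.0, iterations=200):
    """Lambda*(m) = sup_V (V m - Lambda(V)), by ternary search on [lo, hi]."""
    def objective(v):
        return v * m - log_mgf(v, p)

    for _ in range(iterations):
        left = lo + (hi - lo) / 3.0
        right = hi - (hi - lo) / 3.0
        if objective(left) < objective(right):
            lo = left                    # the maximiser lies to the right of `left`
        else:
            hi = right                   # the maximiser lies to the left of `right`
    return objective(0.5 * (lo + hi))


def relative_entropy(m, p):
    """Closed form of Lambda* for 0 < m < 1: Ent(Bernoulli(m) | Bernoulli(p))."""
    return m * math.log(m / p) + (1.0 - m) * math.log((1.0 - m) / (1.0 - p))


if __name__ == "__main__":
    p = 0.5
    for m in (0.5, 0.6, 0.7, 0.9):
        print(m, legendre(m, p), relative_entropy(m, p))
```

In this example the numerical transform agrees with the relative entropy, which is the rate function of the Bernoulli empirical mean, so the upper bound (10.20) is sharp here.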



In many interesting practical situations, the Legendre-Fenchel transform
of the asymptotic logarithmic moment-generating function coincides with
the rate function governing the LDP of the sequence $P^N$. The first objective
is to extend the LDP upper bounds (10.20) to any closed subset. Intuitively
speaking, this extension should be possible as soon as the probability mass
of the sequence $P^N$ is exponentially concentrated on a compact set. The
right notion that makes this result precise is the following.

Definition 10.3.2 We say that a sequence of distributions $P^N$ is
exponentially tight if for any $a > 0$ there exists a compact set $A_a$ such that

$$\limsup_{N\to\infty} \frac{1}{N}\log P^N(M - A_a) < -a$$

As an aside, when $P^N$ is exponentially tight, an LDP lower bound would
imply that

$$-a > \limsup_{N\to\infty} \frac{1}{N}\log P^N(M - A_a) \ge -H(M - A_a)$$

In this case, this shows that all the level sets $H^{-1}([0,a]) \subset A_a$ are
necessarily compact.







As mentioned above, another important consequence of the exponential
tightness property is that the proof of the LDP upper bound for closed sets
reduces to that of the LDP upper bound for compact sets.

Proposition 10.3.2 Let $P^N$ be an exponentially tight sequence of
distributions on a Hausdorff topological space $M$ equipped with the Borel $\sigma$-field
$\mathcal{B}(M)$. A function $H$ governs the LDP upper bounds for all closed sets as
soon as it governs the LDP upper bounds for all compact sets.

Proof:

To prove this assertion, we simply observe that for any closed set $B \subset
(M - H^{-1}([0,\varepsilon]))$, for some $\varepsilon > 0$ (otherwise we have $H(B) = 0$ and the
desired implication is trivially checked), we have

$$P^N(B) \le P^N(A_\varepsilon \cap B) + P^N(M - A_\varepsilon)$$

When the LDP upper bound holds true for the compact set $B \cap A_\varepsilon$, we
conclude that for any $H(B) > \varepsilon$

$$\limsup_{N\to\infty} \frac{1}{N}\log P^N(B) \le (-H(A_\varepsilon \cap B)) \vee (-\varepsilon) = -\varepsilon$$

Taking the infimum of $(-\varepsilon)$ over all $\varepsilon < H(B)$ yields the desired upper
bound. ■

To illustrate this notion of exponential tightness, we come back to particle
approximation models of Feynman-Kac distributions and the crude
exponential upper bound presented in Section 7.4. We further assume that
$E_n$ are Polish state spaces equipped with the Borel sigma-field $\mathcal{E}_n$ and the
potential functions are strictly positive. In this situation, we recall that
$\mathcal{P}(E_n)$ equipped with the weak topology is again a Polish space. By a theorem
of Prohorov, the closure $\overline{A}_n$ of a set $A_n \subset \mathcal{P}(E_n)$ is compact if and
only if the set $A_n$ is tight; that is, if for each $\delta_n > 0$ there exists a compact
set $C_n \subset E_n$ such that $\inf_{m_n \in A_n} m_n(C_n) \ge 1 - \delta_n$. Using the fact
that any single probability measure on the Polish space is tight, for any
$\delta_n > 0$ we associate with the Feynman-Kac distribution $\eta_n$ a compact set
$C_n(\delta_n) \subset E_n$ such that $\eta_n(C_n(\delta_n)) \ge 1 - \delta_n$. The reader will immediately
notice that the sets

$$A_n(\eta_n, \delta_n) = \{ m_n \in \mathcal{P}(E_n) : m_n(C_n(\delta_n/2)) - \eta_n(C_n(\delta_n/2)) \ge -\delta_n/2 \}$$

are compact and by construction we have

$$\inf_{m_n \in A_n(\eta_n, \delta_n)} m_n(C_n(\delta_n/2)) \ge 1 - \delta_n$$

and

$$A_n^c(\eta_n, \delta_n) \subset \{ m_n \in \mathcal{P}(E_n) : |\eta_n(C_n(\delta_n/2)) - m_n(C_n(\delta_n/2))| > \delta_n/2 \}$$







Using the exponential estimates (7.21) presented in Theorem 7.4.3, by the
union of event bounds, we find that

for sufficiently large $N$ and for some finite constants $c(n)$ and $d(n) < \infty$
whose values only depend on the time parameter. This proves that the
laws of the particle density profiles $(\eta_0^N, \ldots, \eta_n^N)$ are exponentially tight in
the product of the spaces $\mathcal{P}(E_p)$, $0 \le p \le n$, equipped with the product topology,
and by Proposition 10.3.2 the
analysis of the LDP upper bound reduces to that of the LDP upper bounds
on compact sets.

Proving the LDP lower bounds is often a delicate problem. Several dif- 
ferent strategies have been presented in the literature. One possible route 
is to follow the proof of Cramér's theorem on the LDP for independent and
$\mathbb{R}^d$-valued random variables. This technique was initiated by Gartner
and Ellis in the context of non-iid and $\mathbb{R}^d$-valued random sequences. It has
been further extended by Baldi to any sequence of $M$-valued random variables.
We refer the reader to [112] for more precise information as well as for
a complete list of references. This strategy gives fruitful results as soon 
as the asymptotic logarithmic moment-generating function has some nice 
regularity properties, and only if the expected rate function is convex. The 
proof of this result depends on several deep results on convex analysis that 
are not discussed here and thus it will be omitted. For more details, we 
refer to Corollary 4.5.27 in [112]. The next theorem is due to Dembo and 
Zeitouni. It is a somewhat weaker version of Baldi’s theorem for Banach- 
valued random sequences, but it provides a precise connection between the 
regularity of A and the desired LDP lower bounds. 

Theorem 10.3.1 (Baldi, Dembo-Zeitouni) Let $P^N$ be an exponentially
tight sequence of probability measures on a Banach space $M$. Suppose that
the asymptotic logarithmic moment-generating function $\Lambda : M^* \to \mathbb{R}$ is
finite, l.s.c. with respect to the $M$-topology on $M^*$, and Gateaux differentiable;
that is, $\epsilon \mapsto \Lambda(V_1 + \epsilon V_2)$ is differentiable at $\epsilon = 0$ for any pair
$V_1, V_2 \in M^*$. Then $P^N$ satisfies the LDP with the good rate function $\Lambda^*$.



10.4 Laplace-Varadhan's Integral Techniques

In Section 10.2, we saw that LDP and good rate functions are preserved under
continuous transformations. This property allows transfer of LDP from
a sequence of distributions $P^N$ into another $Q^N$ as soon as $Q^N$ is the image
measure of $P^N$ with respect to some continuous transformation between
the state spaces. This quite elementary result is often used in practice and
provides a simple way to transfer LDP when the corresponding random







variables are connected by some regular transformation. In this section,
we examine the situation in which the structural connection above is replaced
by an integral continuity property. The first outstanding result in
this direction is the following theorem due to Varadhan and often called
Varadhan's or the Laplace-Varadhan Lemma in the literature on LDP.

Theorem 10.4.1 (Laplace-Varadhan) Let $M$ be a Hausdorff and regular
topological space equipped with a $\sigma$-field $\sigma(M) \supset \mathcal{B}(M)$. Suppose that a
sequence of probability measures $Q^N \in \mathcal{P}(M)$, $N \ge 1$, satisfies the LDP on
$M$ with a good rate function $H : M \to [0,\infty]$. Then, for any bounded l.s.c.
function $V : M \to \mathbb{R}$ and for any open set $A$, we have

$$\liminf_{N \to \infty} \frac{1}{N} \log \int_A e^{NV}\, dQ^N \ge \sup_A (V - H) \qquad (10.21)$$

In addition, for any bounded u.s.c. function $V : M \to \mathbb{R}$ and for any closed
set $A$, we have

$$\limsup_{N \to \infty} \frac{1}{N} \log \int_A e^{NV}\, dQ^N \le \sup_A (V - H) \qquad (10.22)$$

Inversely, if (10.21) and (10.22) hold true for some l.s.c. function $H$, then
$Q^N$ satisfies the LDP on $M$ with a rate function $H : M \to [0,\infty]$.

Proof: 

We first prove the lower bound. Since $V$ is an l.s.c. function, we recall
that $V(u) = \sup\{V(B) : u \in B,\ B \text{ open} \subset M\}$, for every $u \in M$. Consequently,
for any $u \in A$ and $\epsilon > 0$, there exists an open neighborhood
$B \subset A$ of $u$ such that $V(u) \le \epsilon + V(B)$. It follows that

$$Q^N(1_A\, e^{NV}) \ge Q^N(1_B\, e^{NV}) \ge e^{N(V(u) - \epsilon)}\, Q^N(B)$$

Hence it follows that $\frac{1}{N}\log Q^N(1_A\, e^{NV}) \ge (V(u) - \epsilon) + \frac{1}{N}\log Q^N(B)$.
Taking into account the LDP lower bound on open sets $B$, we also have for
any $u \in B$

$$\liminf_{N \to \infty} \frac{1}{N} \log Q^N(1_A\, e^{NV}) \ge (V(u) - \epsilon) - H(B) \ge (V(u) - \epsilon) - H(u)$$

Taking the supremum over all points $u \in A$ and letting $\epsilon \to 0$ we find that

$$\liminf_{N \to \infty} \frac{1}{N} \log Q^N(1_A\, e^{NV}) \ge \sup_A (V - H)$$

To prove the upper bound, we fix some $\epsilon > 0$ and an arbitrary $b < \infty$ and
we cover the sets $A_b = A \cap H^{-1}([0,b])$ with a collection of sufficiently small
neighborhoods $B(u)$, $u \in A_b$, such that

$$\sup_{B(u)} V \le V(u) + \epsilon \quad \text{and} \quad \inf_{\overline{B(u)}} H \ge H(u) - \epsilon$$







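Taking into account that $A$ is closed and the level sets are compact, the
trace set $A_b$ is also compact and we can extract a finite cover
$A_b \subset B_n = \cup_{i=1}^n B(u_i)$ and hence

$$Q^N(e^{NV}\, 1_A) \le \sum_{i=1}^n Q^N(e^{NV}\, 1_{B(u_i)}) + Q^N(e^{NV}\, 1_{A - B_n})$$

It now follows from the LDP upper bound that

$$\limsup_{N \to \infty} \frac{1}{N} \log Q^N(e^{NV}\, 1_A) \le \vee_{i=1}^n \left( [V(u_i) + \epsilon] - H(\overline{B(u_i)}) \right) \vee \left( \|V\| - H(A - B_n) \right)$$

$$\le \vee_{i=1}^n \left( V(u_i) - H(u_i) + 2\epsilon \right) \vee (\|V\| - b)$$

$$\le \left( \sup_A [V - H] + 2\epsilon \right) \vee (\|V\| - b)$$

We end the proof of (10.22) by letting $(\epsilon, b) \to (0, \infty)$. The last assertion
is a simple consequence of the definition of the LDP. ■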
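The Laplace-Varadhan asymptotics can be checked numerically in a simple case. The sketch below is an illustration under assumptions introduced here, not a construction from the text: $Q^N$ is taken to be the law of the empirical mean of $N$ fair coin tosses on $M = [0,1]$, whose good rate function is $H(x) = x\log(2x) + (1-x)\log(2(1-x))$, and $V(x) = -4(x - 0.8)^2$ is an arbitrary bounded continuous function. The code compares $\frac{1}{N}\log\int e^{NV}\,dQ^N$ with $\sup(V - H)$ taken over a grid.

```python
import math

def rate(x):
    """H(x): relative entropy of Bernoulli(x) w.r.t. Bernoulli(1/2), with 0 log 0 = 0."""
    def term(a):
        return 0.0 if a == 0.0 else a * math.log(2.0 * a)
    return term(x) + term(1.0 - x)

def V(x):
    """A bounded continuous test function on [0, 1] (hypothetical choice)."""
    return -4.0 * (x - 0.8) ** 2

def log_integral(N):
    """(1/N) log E[exp(N V(S_N / N))] with S_N ~ Binomial(N, 1/2), via log-sum-exp."""
    logs = [
        math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
        - N * math.log(2.0) + N * V(k / N)
        for k in range(N + 1)
    ]
    m = max(logs)
    return (m + math.log(sum(math.exp(t - m) for t in logs))) / N

def varadhan_sup(grid=100001):
    """Grid approximation of sup over [0, 1] of (V - H)."""
    return max(V(i / (grid - 1)) - rate(i / (grid - 1)) for i in range(grid))

for N in (50, 500, 2000):
    print(N, log_integral(N), varadhan_sup())
```

The gap between the two columns shrinks as N grows, which is what (10.21) and (10.22) with $A = M$ predict in this toy setting.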

As we already mentioned in the introduction, the Laplace-Varadhan integral
lemma can also be regarded as a powerful change of reference probability
technique that allows transfer of large-deviation principles from a
sequence of probability measures $Q^N$ to another $P^N$. The integral connection
usually consists of a pair of absolutely continuous measures
$P^N \ll Q^N$ such that

$$\frac{dP^N}{dQ^N} = \exp(NV) \qquad Q^N\text{-a.e.} \qquad (10.23)$$

for some measurable mapping $V : M \to \mathbb{R}$. When $V$ is bounded continuous,
the theorem above allows transfer of an LDP on $Q^N$ to the sequence $P^N$.
Rephrasing Theorem 10.4.1, we can state the following corollary.

Corollary 10.4.1 Assume that $Q^N$ satisfies an LDP with good rate function
$H : M \to [0,\infty]$. If $P^N$ satisfies the continuity condition (10.23) for
some $V \in C_b(M)$, then it satisfies an LDP with good rate function $(H - V)$.
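A familiar instance of this change of reference measure is exponential tilting of coin tosses. The sketch below rests on assumptions chosen here for illustration: $Q^N$ is the law of the mean of $N$ fair coin tosses, and $V(x) = \theta x - \log((1+e^\theta)/2)$ for a hypothetical parameter $\theta$. Under these assumptions the tilted law is the law of the mean of $N$ tosses with bias $p = e^\theta/(1+e^\theta)$, and $H - V$ is the relative entropy with respect to Bernoulli($p$). The code checks both identities numerically.

```python
import math

def log_mgf(theta):
    """log E[exp(theta X)] for X ~ Bernoulli(1/2)."""
    return math.log((1.0 + math.exp(theta)) / 2.0)

def tilted_pmf(N, k, theta):
    """Mass at k/N of dP^N = exp(N V) dQ^N with V(x) = theta x - log_mgf(theta)."""
    base = math.comb(N, k) * 0.5 ** N
    return base * math.exp(N * (theta * k / N - log_mgf(theta)))

def binomial_pmf(N, k, p):
    return math.comb(N, k) * p ** k * (1.0 - p) ** (N - k)

def entropy_rate(x, p):
    """Relative entropy of Bernoulli(x) w.r.t. Bernoulli(p), with 0 log 0 = 0."""
    def term(a, b):
        return 0.0 if a == 0.0 else a * math.log(a / b)
    return term(x, p) + term(1.0 - x, 1.0 - p)

def tilted_rate(x, theta):
    """The rate function H - V of the corollary in this toy example."""
    return entropy_rate(x, 0.5) - (theta * x - log_mgf(theta))

theta = 0.7
p = math.exp(theta) / (1.0 + math.exp(theta))
print([round(tilted_pmf(10, k, theta) - binomial_pmf(10, k, p), 15) for k in range(11)])
print([(x, tilted_rate(x, theta), entropy_rate(x, p)) for x in (0.2, 0.5, 0.8)])
```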

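The pair of distributions $(P^N, Q^N)$ is frequently defined in terms of the
image measures

$$P^N = \mathbb{P}^N \circ \pi_N^{-1} \quad \text{and} \quad Q^N = \mathbb{Q}^N \circ \pi_N^{-1}$$

for some probabilities $\mathbb{P}^N$ and $\mathbb{Q}^N$ on some measurable space $\Omega^N$ which
may depend on $N$ and for some measurable mapping $\pi_N : \Omega^N \to M$.

Observe that if $\mathbb{P}^N$ and $\mathbb{Q}^N$ are absolutely continuous and for $\mathbb{Q}^N$-a.e. $x$

$$\frac{d\mathbb{P}^N}{d\mathbb{Q}^N}(x) = \exp(N V(\pi_N(x))) \qquad (10.24)$$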







then the probability images $P^N$ and $Q^N$ are absolutely continuous, their
Radon-Nikodym derivative satisfies (10.23), and the Laplace-Varadhan lemma
applies as soon as $V$ is a bounded continuous mapping.

When $(M, d)$ is a metric space, the author and T. Zajic have recently presented
in [109] a new strategy to relax the analytic representation (10.24).
The idea consists in replacing $\mathbb{P}^N$ and $\mathbb{Q}^N$ by a pair of sequences $\mathbb{P}^N_{a,m}$
and $\mathbb{Q}^N_m$ indexed respectively by a parameter pair $(a, m)$ with $a \in \mathbb{R}$ and
$m \in M$ and by a parameter $m \in M$. Instead of (10.24), we suppose that
for any index pair $(a, m) \in (\mathbb{R} \times M)$ we have $\mathbb{P}^N_{a,m} \sim \mathbb{Q}^N_m$ and for
$x \in \Omega^N$

$$\frac{d\mathbb{P}^N_{a,m}}{d\mathbb{Q}^N_m}(x) = \exp\left( N [a S_N(x, m) + V_a(\pi_N(x), m)] \right) \qquad (10.25)$$

for some measurable functions $S_N : \Omega^N \times M \to \mathbb{R}$ and $V_a : M \times M \to \mathbb{R}$.

We also assume that $\mathbb{P}^N_{1,m}$ is independent of $m$ and denote the former by
$\mathbb{P}^N_1$. For any $(a, m) \in (\mathbb{R} \times M)$, we define the image measures

$$P^N_{a,m} = \mathbb{P}^N_{a,m} \circ \pi_N^{-1} \quad \text{and} \quad Q^N_m = \mathbb{Q}^N_m \circ \pi_N^{-1}$$



Lemma 10.4.1 Suppose the sequence of probability measures $Q^N_m$ satisfies
an LDP with good rate function $H_m : M \to [0,\infty]$ for each $m \in M$. Also
assume that the mappings $V_a(\cdot, m)$, $a \in \mathbb{R}$, are continuous at each $m$,
$V_a(m, m) = 0$, and the exponential moment condition

$$\limsup_{N \to \infty} \frac{1}{N} \log \int_{\Omega^N} \exp\left( nN [S_N(x, m) + V_1(\pi_N(x), m)] \right) d\mathbb{Q}^N_m(x) < \infty \qquad (10.26)$$

holds for some $(m, n) \in M \times (1, \infty)$. Then $P^N_1$ satisfies an LDP with good
rate function

$$I : m \in M \to I(m) = H_m(m) \in [0, \infty]$$

As we shall see later, this integral transfer lemma is a natural tool for
studying the LDP of mean field interacting particle models. It has been
applied with success in [109, 110] to continuous time and McKean-Vlasov
type particle models. It will also be central in the proof of Theorem 10.1.1
provided in Section 10.7.

Before getting into the strategy of proof of this lemma, it is convenient
to better connect this result with Theorem 10.4.1. To this end, we first
observe that condition (10.26) is met as soon as the functions $V_1(\cdot, m)$ and
$V_n(\cdot, m)$ are bounded for some $(m, n) \in M \times (1, \infty)$. To see this claim, we
note that

$$\int_{\Omega^N} \exp\left( nN [S_N(x, m) + V_1(\pi_N(x), m)] \right) d\mathbb{Q}^N_m(x) = \int_{\Omega^N} \exp\left( N [n V_1(\pi_N(x), m) - V_n(\pi_N(x), m)] \right) d\mathbb{P}^N_{n,m}(x) < \infty$$







Also notice that when $S_N = 0$ are the null mappings, we have for any
$u \in M$ and $N \ge 1$, $\frac{dP^N_1}{dQ^N_m}(u) = \exp(N [V_1(u, m)])$. If $V_1(\cdot, m)$ is continuous
and (10.26) holds, then the family of probability measures $P^N_1$
satisfies the LDP with rate function $I = H_m - V_1(\cdot, m)$. In the case where
$V_1(\cdot, m)$ is continuous and (10.26) holds for all $m$, since $V_1(u, u) = 0$ we
conclude that $I(u) = H_u(u)$.

The following technical proposition states the exponential tightness property
and two key estimates needed for proving our result.

Proposition 10.4.1 Under the assumptions of the integral lemma above,
the sequence of probability measures $P^N_1$ on $M$ is exponentially tight. For
any Borel subset $A \subset M$ and for any $1/n + 1/n' = 1$, $1 < n, n' < \infty$, and
$m \in M$, we have

$$P^N_1(A) \le Q^N_m(A)^{1/n'} \exp[N \delta_n(m, A)] \qquad (10.27)$$

$$Q^N_m(A) \le P^N_1(A)^{1/n}\, P^N_{a(n),m}(A)^{1/n'} \exp[N \delta_{a(n)}(m, A)/n] \qquad (10.28)$$

with $a(n) = -n'/n$ and for any $a \ne 0$

$$\delta_a(m, A) = \sup_{u \in A} |V_1(u, m) - V_a(u, m)/a|$$

Lemma 10.4.1 is an almost direct consequence of this proposition. 

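Proof of Lemma 10.4.1: If we take in (10.27) the closure of the ball
of radius $\epsilon$ and center $m \in M$, that is

$$A = \overline{B}(m, \epsilon) = \{u \in M : d(u, m) \le \epsilon\}$$

we find that for any conjugate exponents $1/n + 1/n' = 1$ with $1 < n, n' < \infty$

$$P^N_1(\overline{B}(m, \epsilon)) \le Q^N_m(\overline{B}(m, \epsilon))^{1/n'} \exp\left[ N \delta_n(m, \overline{B}(m, \epsilon)) \right]$$

Recalling that $\{Q^N_m ; N \ge 1\}$ satisfies the LDP with a good rate function
$H_m$, this implies that

$$\limsup_{N \to \infty} \frac{1}{N} \log P^N_1(\overline{B}(m, \epsilon)) \le -\frac{1}{n'} H_m(\overline{B}(m, \epsilon)) + \delta_n(m, \overline{B}(m, \epsilon)) \qquad (10.29)$$

Since $H_m$ is a good rate function, by Proposition 10.2.3 we find that

$$I(m) = H_m(m) = \lim_{\epsilon \to 0} H_m(\overline{B}(m, \epsilon))$$

Since each mapping $V_n(\cdot, m) : M \to \mathbb{R}$ is continuous at the point $m$ and
$V_n(m, m) = 0$, by the definition of $\delta_n$ we also have that

$$\lim_{\epsilon \to 0} \delta_n(m, \overline{B}(m, \epsilon)) = 0$$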







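Taking first the limit $\epsilon \downarrow 0$ and then $n' \downarrow 1$ in (10.29), we find that

$$\lim_{\epsilon \to 0} \limsup_{N \to \infty} \frac{1}{N} \log P^N_1(\overline{B}(m, \epsilon)) \le -I(m) \qquad (10.30)$$

Now if we take in (10.28) the open ball

$$A = B(m, \epsilon) = \{u \in M : d(u, m) < \epsilon\}$$

we get

$$Q^N_m(B(m, \epsilon)) \le P^N_1(B(m, \epsilon))^{1/n} \exp\left[ N \delta_{a(n)}(m, B(m, \epsilon))/n \right]$$

Our assumptions on $Q^N_m$ imply that

$$-H_m(m) \le -H_m(B(m, \epsilon)) \le \liminf_{N \to \infty} \frac{1}{N} \log Q^N_m(B(m, \epsilon))$$

Arguing as above, this implies that

$$-I(m) \le \liminf_{N \to \infty} \frac{1}{N} \log Q^N_m(B(m, \epsilon)) \le \frac{1}{n} \left[ \liminf_{N \to \infty} \frac{1}{N} \log P^N_1(B(m, \epsilon)) + \delta_{a(n)}(m, B(m, \epsilon)) \right]$$

Considering the limit $\epsilon \downarrow 0$, one obtains for any $n > 1$

$$-n\, I(m) \le \lim_{\epsilon \to 0} \liminf_{N \to \infty} \frac{1}{N} \log P^N_1(B(m, \epsilon))$$

Letting $n \downarrow 1$, we get from (10.30)

$$\lim_{\epsilon \to 0} \limsup_{N \to \infty} \frac{1}{N} \log P^N_1(\overline{B}(m, \epsilon)) \le -I(m) \le \lim_{\epsilon \to 0} \liminf_{N \to \infty} \frac{1}{N} \log P^N_1(B(m, \epsilon))$$

Since we clearly have

$$\lim_{\epsilon \to 0} \liminf_{N \to \infty} \frac{1}{N} \log P^N_1(B(m, \epsilon)) \le \lim_{\epsilon \to 0} \limsup_{N \to \infty} \frac{1}{N} \log P^N_1(\overline{B}(m, \epsilon))$$

it follows that for any $m \in M$

$$\lim_{\epsilon \to 0} \limsup_{N \to \infty} \frac{1}{N} \log P^N_1(\overline{B}(m, \epsilon)) = (-I)(m) = \lim_{\epsilon \to 0} \liminf_{N \to \infty} \frac{1}{N} \log P^N_1(B(m, \epsilon))$$

By Proposition 10.2.4, we conclude that $I$ is an l.s.c. function and it governs
the weak LDP for $P^N_1$. Since the sequence $P^N_1$ is exponentially tight,
we recall that the weak LDP is equivalent to the full LDP and the proof of
the lemma is now completed. ■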

We now come to the proof of the technical proposition. 







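Proof of Proposition 10.4.1: Fixing $n > 1$ so that (10.26) holds and
denoting the left-hand side of (10.26) by $n c_n$, we have, for $N$ large enough,

$$\mathbb{Q}^N_m\left( \left( \frac{d\mathbb{P}^N_1}{d\mathbb{Q}^N_m} \right)^n \right) = \int_{\Omega^N} \exp\left( nN [S_N(x, m) + V_1(\pi_N(x), m)] \right) d\mathbb{Q}^N_m(x) \le \exp(n c_n N) \qquad (10.31)$$

Since each probability $Q^N_m$ is a tight measure on $M$ and the sequence $Q^N_m$,
$N \ge 1$, satisfies a full LDP, one concludes that $Q^N_m$ is exponentially tight.
For any $a < \infty$, there exists a compact set $K(m, a) \subset M$ such that

$$\limsup_{N \to \infty} \frac{1}{N} \log Q^N_m(K^c(m, a)) \le -a \quad \text{with} \quad K^c(m, a) = M - K(m, a)$$

To prove that $P^N_1$ is exponentially tight, we first note that

$$P^N_1(\widehat{K}^c(m, a)) = \mathbb{P}^N_1\left( 1_{\widehat{K}^c(m, a)}(\pi_N(x)) \right)$$

with

$$\frac{1}{n} + \frac{1}{n'} = 1, \quad \text{and} \quad \widehat{K}^c(m, a) = K^c(m, n'(c_n + a))$$

Thus, using Hölder's inequality, we check that

$$P^N_1(\widehat{K}^c(m, a)) \le Q^N_m(\widehat{K}^c(m, a))^{1/n'}\ \mathbb{Q}^N_m\left( \left( \frac{d\mathbb{P}^N_1}{d\mathbb{Q}^N_m} \right)^n \right)^{1/n} \le Q^N_m(\widehat{K}^c(m, a))^{1/n'} \exp(c_n N)$$

Recalling (10.31), the estimate above implies that

$$\limsup_{N \to \infty} \frac{1}{N} \log P^N_1(\widehat{K}^c(m, a)) \le -\frac{1}{n'} \left[ n'(c_n + a) \right] + c_n = -a$$

This clearly ends the proof of the exponential tightness of the sequence $P^N_1$.
In the same way, for any Borel subset $A \subset M$ and for any $1/n + 1/n' = 1$,
$1 < n, n' < \infty$, and $m \in M$, we have

$$P^N_1(A) \le Q^N_m(A)^{1/n'} \times \mathbb{Q}^N_m\left( (1_A \circ \pi_N) \exp\left( nN [S_N(x, m) + V_1(\pi_N(x), m)] \right) \right)^{1/n}$$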







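Since we have

$$\mathbb{Q}^N_m\left( (1_A \circ \pi_N) \exp\left( nN [S_N(\cdot, m) + V_1(\pi_N(\cdot), m)] \right) \right)$$

$$= \mathbb{Q}^N_m\left( (1_A \circ \pi_N) \exp\left( N [n S_N(\cdot, m) + V_n(\pi_N(\cdot), m)] \right) \times \exp\left( N [n V_1(\pi_N(\cdot), m) - V_n(\pi_N(\cdot), m)] \right) \right)$$

$$\le P^N_{n,m}(A) \exp\left( N \sup_{u \in A} |n V_1(u, m) - V_n(u, m)| \right) = P^N_{n,m}(A) \exp(nN \delta_n(m, A))$$

we find that $P^N_1(A) \le Q^N_m(A)^{1/n'}\, P^N_{n,m}(A)^{1/n}\, e^{N \delta_n(m, A)} \le Q^N_m(A)^{1/n'}\, e^{N \delta_n(m, A)}$. This establishes
(10.27). To prove (10.28), we first use the decomposition

$$1_A(\pi_N(x)) = 1_A(\pi_N(x)) \exp\left( \frac{N}{n} [S_N(x, m) + V_1(\pi_N(x), m)] \right) \times 1_A(\pi_N(x)) \exp\left( -\frac{N}{n} [S_N(x, m) + V_1(\pi_N(x), m)] \right)$$

and Hölder's inequality to prove that

$$Q^N_m(A) \le \mathbb{Q}^N_m\left( 1_A \circ \pi_N \exp\left( N [S_N(\cdot, m) + V_1(\pi_N(\cdot), m)] \right) \right)^{1/n} \times \mathbb{Q}^N_m\left( 1_A \circ \pi_N \exp\left( -N \frac{n'}{n} [S_N(\cdot, m) + V_1(\pi_N(\cdot), m)] \right) \right)^{1/n'}$$

$$= P^N_1(A)^{1/n} \times \mathbb{Q}^N_m\left( 1_A \circ \pi_N \exp\left( -N \frac{n'}{n} [S_N(\cdot, m) + V_1(\pi_N(\cdot), m)] \right) \right)^{1/n'}$$

We finally observe that

$$\mathbb{Q}^N_m\left( 1_A \circ \pi_N \exp\left( -N \frac{n'}{n} [S_N(\cdot, m) + V_1(\pi_N(\cdot), m)] \right) \right) = \mathbb{Q}^N_m\left( 1_A \circ \pi_N \exp\left( N a(n) [S_N(\cdot, m) + V_1(\pi_N(\cdot), m)] \right) \right)$$

$$= \mathbb{Q}^N_m\left( 1_A \circ \pi_N \exp\left( N [a(n) S_N(\cdot, m) + V_{a(n)}(\pi_N(\cdot), m)] \right) \times \exp\left( N [a(n) V_1(\pi_N(\cdot), m) - V_{a(n)}(\pi_N(\cdot), m)] \right) \right)$$

$$\le P^N_{a(n),m}(A) \times \exp\left( N |a(n)|\, \delta_{a(n)}(m, A) \right) \qquad (10.32)$$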






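and from (10.32), together with $|a(n)|/n' = 1/n$, we obtain

$$Q^N_m(A) \le P^N_1(A)^{1/n}\, P^N_{a(n),m}(A)^{1/n'} \exp\left[ N \delta_{a(n)}(m, A)/n \right]$$

This establishes (10.28), and the proof of the proposition is now completed.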



10.5 Dawson-Gartner Projective Limits Techniques 

In this section, we present another powerful and natural method of lifting 
LDP on finite-dimensional spaces to infinite-dimensional ones. This projective
limit approach to LDP is due to D. Dawson and J. Gartner [73, 74]
and it has been further developed by A. de Acosta in a series of three
articles [1, 2, 3]. As we shall see, this technique can be interpreted as an
extended version of the contraction principle to projective limit spaces. The 
idea is the following. 

Definition 10.5.1 Let $M$ be a given set and let

$$\mathcal{M} = \{(M_U, p_U) : U \in \mathcal{U}\}$$

be a collection of topological spaces $M_U$ and maps $p_U : M \to M_U$ indexed by
a set $\mathcal{U}$. The projective limit topology of $M$ determined by $\mathcal{M}$ is the topology
generated by the collection of open sets $\{p_U^{-1}(A) : A \text{ open} \subset M_U\}$.

By a direct application of the contraction theorem (Theorem 10.2.1), we 
prove the following proposition. 

Proposition 10.5.1 Let $M$ be a topological space equipped with the projective
limit topology determined by a collection of topological spaces and
maps $(M_U, p_U)$ indexed by some set $\mathcal{U}$.

• For any $U \in \mathcal{U}$, the $p_U$-image $H \circ p_U^{-1}$ of a good rate function $H$ on
$M$ is a good rate function on $M_U$.

• Assume that a sequence of distributions $Q^N$ satisfies the LDP on $M$
with the good rate function $H$. Then, for any $U \in \mathcal{U}$, the $p_U$-image
measures $Q^N \circ p_U^{-1}$ satisfy the LDP on the topological space $M_U$ with
good rate function $H \circ p_U^{-1}$.

Definition 10.5.2 A directed set is a preordered set $(\mathcal{U}, \le)$ with the following
property: For any pair $(U, V) \in \mathcal{U}^2$, there exists some $W \in \mathcal{U}$ such
that $U \le W$ and $V \le W$.

Definition 10.5.3 Let $(\mathcal{U}, \le)$ be a directed set and let $\mathcal{M} = \{M_U : U \in \mathcal{U}\}$
be a family of Hausdorff topological spaces indexed by $\mathcal{U}$. For each pair







of indexes $V \le U$, assume that there are given a collection of continuous
mappings $p_{U,V} : M_U \to M_V$ with $p_{U,U} = Id$ satisfying the compatibility
conditions

$$U \ge V \ge W \implies p_{U,W} = p_{V,W} \circ p_{U,V}$$

Then the family $(M_U, p_{U,V})_{U \ge V}$ is called a projective (or inverse) spectrum
of $\mathcal{U}$ with spaces $M_U$ and connecting maps $p_{U,V}$.

The fact that the maps $p_{U,V}$ go in the opposite direction of the order is
clearly mathematically irrelevant. Furthermore, since we have assumed that
$p_{U,U} = Id$, "identifying" $U$ with $M_U$, the set $(U, p_{U,V})_{U \ge V}$ is itself the projective
(or inverse) spectrum of $\mathcal{U}$. We have made these two choices for later
convenience. We denote by $p_U$ the canonical projection from $\prod_{U \in \mathcal{U}} M_U$
into $M_U$.

Definition 10.5.4 Given a projective spectrum $(M_U, p_{U,V})_{U \ge V}$ of $\mathcal{U}$, we
introduce the product space $\prod_{U \in \mathcal{U}} M_U$ and, for each $U$, let $p_U$ be its projection
onto the $U$-factor $M_U$. The subspace

$$\varprojlim_{\mathcal{U}} \mathcal{M} = \Big\{ m \in \prod_{U \in \mathcal{U}} M_U : \forall U \ge V \quad p_V(m) = p_{U,V}(p_U(m)) \Big\}$$

is called the projective (or inverse) limit space of the spectrum.

The product space $\prod_{U \in \mathcal{U}} M_U$ is equipped with the product topology;
that is, the weakest topology such that the projections $p_U$ are continuous.
These maps are clearly continuous, and the Hausdorff property of the
spaces $M_U$, $U \in \mathcal{U}$, is clearly transferred to $\varprojlim_{\mathcal{U}} \mathcal{M}$. The Hausdorff property
and the choice of a directed set $(\mathcal{U}, \le)$ are also not innocent. They ensure
that $\varprojlim_{\mathcal{U}} \mathcal{M}$ is a closed subset in $\prod_{U \in \mathcal{U}} M_U$ and the relative topology on
$\varprojlim_{\mathcal{U}} \mathcal{M}$ is generated by the sets

$$\{ p_U^{-1}(A) : A \text{ open} \subset M_U,\ U \in \mathcal{U} \} \qquad (10.33)$$

(see for instance Theorems 2.3 and 2.4 in Appendix 2, Section 2 in [126]).
Notice that for any collection of closed sets $F_U$, $U \in \mathcal{U}$, such that $p_{U,V}(F_U) = F_V$
as soon as $U \ge V$, the set $F = \cap_{U \in \mathcal{U}}\, p_U^{-1}(F_U)$ is closed since it coincides
with the projective limit of the spectrum $(F_U, p^F_{U,V})$, where $p^F_{U,V} : F_U \to F_V$
stands for the restriction of the mapping $p_{U,V}$ to $F_U$. Since by Tychonoff's
theorem a product of compact spaces is compact, we finally note that
$\varprojlim_{\mathcal{U}} \mathcal{M}$ is compact as soon as all the sets $M_U$, $U \in \mathcal{U}$, are compact. We shall
always assume that a projective limit set $M = \varprojlim_{\mathcal{U}} \mathcal{M}$ is equipped with a
$\sigma$-field $\sigma(M)$ such that $\sigma(M) \supset \cup_{U \in \mathcal{U}}\, p_U^{-1}(\mathcal{B}(M_U))$. The next theorem is a
slight modification of a theorem of Dawson and Gartner.

Theorem 10.5.1 (Dawson-Gartner) Let $M = \varprojlim_{\mathcal{U}} \mathcal{M}$ be the projective
limit of the spectrum $(M_U, p_{U,V})_{U \ge V}$ of a directed set $\mathcal{U}$, and let $H$ be a
given function from $M$ into $[0, \infty]$. Then $H$ is a good rate function on $M$ if







and only if there exists a collection of good rate functions $I_U$ on each space
$M_U$ with $U \in \mathcal{U}$ and such that

$$H = \sup_{U \in \mathcal{U}} (I_U \circ p_U) \qquad (10.34)$$

A sequence of probability measures $Q^N$ on $M$ satisfies the LDP with some
good rate function $H$ if and only if the sequence of all image measures
$Q^N \circ p_U^{-1}$ satisfies the LDP for some good rate functions $I_U$ with $U \in \mathcal{U}$.
In each situation, the rate functions are respectively given by (10.34) and
$I_U = H \circ p_U^{-1}$.

Proof:

Suppose that $H$ is a good rate function on $\varprojlim_{\mathcal{U}} \mathcal{M}$. By a contraction argument,
we see that the image functions $I_U = H \circ p_U^{-1}$ are good rate functions
on $M_U$. Using (10.17) and (10.33), we readily prove that

$$H(m) = \sup_{U \in \mathcal{U}} \sup\{ H(p_U^{-1}(A)) : p_U(m) \in A,\ A \text{ open} \subset M_U \} = \sup_{U \in \mathcal{U}} I_U(p_U(m)) \qquad (10.35)$$

Conversely, suppose that $I_U$ are good rate functions on $M_U$, and
let $H$ be the pointwise supremum of the maps $I_U \circ p_U$. Since each of these
maps is l.s.c., the function $H$ is l.s.c. To prove that $H$ has compact level
sets, again by the contraction theorem (Theorem 10.2.1), we notice that
they have to satisfy the compatibility conditions

$$U \ge V \implies I_V = I_U \circ p_{U,V}^{-1} \quad \text{and} \quad I_V^{-1}([0,a]) = p_{U,V}\left[ I_U^{-1}([0,a]) \right]$$

for any $a \in [0, \infty)$. Since projective limits are inherited by closed sets, this
implies that $H^{-1}([0,a]) = (\varprojlim_{\mathcal{U}} \mathcal{M}) \cap \prod_{U \in \mathcal{U}} I_U^{-1}([0,a])$ is the projective
limit of the compact level sets $I_U^{-1}([0,a])$, $U \in \mathcal{U}$. By Tychonoff's theorem,
we conclude that the level set $H^{-1}([0,a])$ is a compact subset of $\varprojlim_{\mathcal{U}} \mathcal{M}$.
This proves that $H$ is a good rate function as soon as each $I_U$ is a good
rate function.

Suppose that the sequence of measures $Q^N \circ p_U^{-1}$ satisfies the LDP upper
bound with the good rate functions $I_U$. Since the closure $\overline{A}$ of a measurable
set $A \in \sigma(M)$ coincides with the projective limit of the closures $\overline{A}_U$ of the
sets $A_U = p_U(A)$, previous considerations imply that $\overline{A} \cap H^{-1}([0,a])$ is the
projective limit of the compact sets $\overline{A}_U \cap I_U^{-1}([0,a])$ for any $a \in [0, \infty)$.
Now if $H(\overline{A}) = 0$, the result is trivial. Otherwise there exists some $a < \infty$
such that $a < H(\overline{A})$. Recalling that the projective limit of a collection
of nonempty compact sets is nonempty, for any $a < H(\overline{A})$ there exists
some $U \in \mathcal{U}$ such that $\overline{A}_U \cap I_U^{-1}([0,a]) = \emptyset$ (otherwise we would get a
contradiction). Since $A \subset p_U^{-1}(\overline{A}_U)$, applying the LDP upper bound for $Q^N \circ p_U^{-1}$







we conclude that

$$\limsup_{N \to \infty} \frac{1}{N} \log Q^N(A) \le \limsup_{N \to \infty} \frac{1}{N} \log Q^N \circ p_U^{-1}(\overline{A}_U) \le -I_U(\overline{A}_U) \le -a$$

We complete the proof by taking the infimum of $(-a)$ over all $a < H(\overline{A})$.
Suppose that $Q^N \circ p_U^{-1}$ satisfy the LDP lower bound with the good rate
function $I_U$. For any $A \in \sigma(M)$ and $m \in \mathring{A}$, there exists an open set
$B_U \subset M_U$ for some $U \in \mathcal{U}$ such that $m \in p_U^{-1}(B_U) \subset \mathring{A} \subset A$. Applying
the LDP lower bound for $Q^N \circ p_U^{-1}$, we find that

$$\liminf_{N \to \infty} \frac{1}{N} \log Q^N(A) \ge \liminf_{N \to \infty} \frac{1}{N} \log Q^N \circ p_U^{-1}(B_U) \ge -I_U(B_U) \ge -I_U(p_U(m))$$

We readily conclude that $Q^N$ satisfies the LDP on $M$ with rate function
$H(m) = \sup_{U \in \mathcal{U}} I_U(p_U(m))$. In the reverse situation, suppose that $Q^N$
satisfies the LDP on $M$ for some good rate function $\overline{H}$. By the contraction
theorem (Theorem 10.2.1), the image measures $Q^N \circ p_U^{-1}$ satisfy the LDP on
$M_U$ with good rate function $I_U = \overline{H} \circ p_U^{-1}$. From previous considerations,
this implies that $Q^N$ also satisfies the LDP on $M$ with the rate function

$$H(m) = \sup_{U \in \mathcal{U}} I_U(p_U(m))$$

From previous arguments or by the uniqueness of the rate function (see
Proposition 10.2.2), we conclude that $H = \overline{H}$. This ends the proof of the
theorem. ■

We end this section with a more or less well-known min-max type theorem 
for good rate functions on projective limit spaces. 

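Theorem 10.5.2 Let $H$ be a good rate function on the projective limit
$M = \varprojlim_{\mathcal{U}} \mathcal{M}$ of the spectrum $(M_U, p_{U,V})_{U \ge V}$ of a directed set $\mathcal{U}$. We let
$I_U = H \circ p_U^{-1}$ be the $p_U$-image function of $H$ on $M_U$. For any closed set
$F \in \sigma(M)$, we have

$$H(F) = \sup_{U \in \mathcal{U}} I_U(\overline{p_U(F)}) \qquad (10.36)$$

Proof:

By (10.34), we first note that (10.36) can be rewritten as

$$H(F) = \inf_{m \in F} \sup_{U \in \mathcal{U}} I_U(p_U(m)) = \sup_{U \in \mathcal{U}} \inf_{m \in \overline{p_U(F)}} I_U(m) =_{\text{def.}} \overline{H}(F)$$

Since we have $I_U(p_U(m)) \ge I_U(\overline{p_U(F)})$ for any $m \in F$, we deduce that
$H(F) \ge \overline{H}(F)$. To prove the reverse inequality, we assume that $\overline{H}(F) < \infty$;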







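otherwise the desired bound trivially holds true. In this case, for each $\epsilon > 0$,
the sets

$$p_U^{-1}(\overline{p_U(F)}) \cap \{ H \le \overline{H}(F) + \epsilon \}$$

are nonempty compact sets. The compactness results from the fact that
$p_U^{-1}(\overline{p_U(F)})$ is a closed set intersected with the compact level set $\{H \le \overline{H}(F) + \epsilon\}$.
On the other hand, if these sets were empty, we would have for any
$m \in p_U^{-1}(\overline{p_U(F)})$

$$H(m) \ge H(p_U^{-1}(\overline{p_U(F)})) = I_U(\overline{p_U(F)}) \ge \overline{H}(F) + \epsilon = \sup_{V \in \mathcal{U}} I_V(\overline{p_V(F)}) + \epsilon$$

This clearly yields a contradiction. Consequently, for each $\epsilon > 0$, we have

$$\cap_{U \in \mathcal{U}} \left[ p_U^{-1}(\overline{p_U(F)}) \cap \{ H \le \overline{H}(F) + \epsilon \} \right] \ne \emptyset$$

Since $F$ coincides with the projective limit of the sets $(\overline{p_U(F)})_{U \in \mathcal{U}}$, we have
$F = \cap_{U \in \mathcal{U}}\, p_U^{-1}(\overline{p_U(F)})$. From previous considerations, for any $\epsilon > 0$ there
exists some point $m_\epsilon \in F$ such that

$$H(F) \le H(m_\epsilon) \le \overline{H}(F) + \epsilon$$

We end the proof of the theorem by letting $\epsilon \to 0$. ■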



10.6 Sanov’s Theorem 

10.6.1 Introduction 

Sanov's theorem is probably one of the most startling results of large deviations
and Monte Carlo approximation theory. This result provides sharp
asymptotic exponential rates for the convergence of the occupation measures
associated with a collection of independent random variables towards
the limiting sampling distribution.

The original proof of Sanov [281] assumes that the underlying random 
variables take values in R. Since this pioneering article, several extensions 
have been presented. In the book of Dembo and Zeitouni [112] the reader 
will find at least three different ways to prove this theorem in the context 
of random sequences taking values in Polish state spaces. Most of these 
strategies consist in deriving Sanov’s theorem as a consequence of more 
general LDP such as Theorem 10.3.1. 

In this section, we take a different perspective. We simplify the analysis 
and, speaking somewhat loosely, we show that proving the strong version 
of Sanov’s theorem in Hausdorff topological spaces is in fact equivalent 
to proving the corresponding statement in finite spaces. This original ap- 
proach is conducted applying the Dawson-Gartner contraction principle to 







a judicious and natural projective interpretation of the strong topology
on the set of probability measures. Using these simplifications, the LDP
upper bounds will be easily derived using the generalized Cramér method
presented in Section 10.3. The proof of the LDP lower bounds is a little
more delicate. It is conducted using an elegant approximation technique
essentially due to Groeneboom, Oosterhoff and Ruymgaart [170].

The projective limit approach to LDP for iid sequences in the $\tau$-topology
can be conducted in various ways, depending on the projective interpretation
of the $\tau$-topology. Our strategy is based on a projective interpretation
of set-additive and $[0, 1]$-valued functions with respect to the class of finite
partitions directed upwards by inclusion. In this interpretation, the LDP
in the $\tau$-topology (for Hausdorff topological spaces) is essentially obtained
from Sanov's theorem on finite state spaces. We shall see that this state-space
enlargement can be interpreted as a projective compactification of
the set of probability measures. Another projective interpretation of the
$\tau$-topology without enlarging the distribution space can be derived using the
class of finite subsets of bounded measurable functions directed upwards
by inclusion. This alternative approach to LDP was developed by A. de
Acosta in [1] (see also [112]).

10.6.2 Topological Preliminaries 

In the further development of this section, $E$ denotes a Hausdorff topological
space equipped with a Borel sigma-field $\mathcal{E}$. We let $P(E)$ be the
set of additive set functions from $\mathcal{E}$ into $[0, 1]$ and $\mathcal{P}(E) \subset P(E)$ be the
subset of all probability measures on $(E, \mathcal{E})$. We equip $P(E)$ with the
$\tau_1$-topology of setwise convergence. More precisely, a sequence of set functions
$(\mu_n)_{n \ge 0} \in P(E)^{\mathbb{N}}$ $\tau_1$-converges to some $\mu \in P(E)$, as $n \to \infty$, if and only
if

$$\lim_{n \to \infty} \mu_n(A) = \mu(A)$$

for all $A \in \mathcal{E}$. It is readily seen that the $\tau$-topology of convergence on all
Borel sets of $E$ is the corresponding relative topology induced on $\mathcal{P}(E)$ by
$P(E)$.

Let $\mathcal{U}$ be the set of all finite and Borel partitions of $E$. Since each $U \in \mathcal{U}$
is a finite partition, the $\sigma$-algebra generated by $U$ and denoted by $\sigma(U)$ is
the finite set formed by $\emptyset$, $E$, and the sets that are unions of elements of
$U$. We slightly abuse the notation and denote by $\mathcal{P}(U)$ and $B_b(U)$ the set
of probability measures on $(E, \sigma(U))$ and the Banach space of all bounded
and $\sigma(U)$-measurable functions on $E$ (equipped with the uniform norm).







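Definition 10.6.1 We associate with each $U = (U^i)_{1 \le i \le d} \in \mathcal{U}$ the Kolmogorov-Smirnov
metric $d_U$ and the $U$-relative entropy $\mathrm{Ent}_U(\cdot|\cdot)$ on $\mathcal{P}(U)$ defined for any
pair $(\mu, \nu) \in \mathcal{P}(U)^2$ by the formulae

$$d_U(\mu, \nu) = 2^{-1} \sum_{1 \le i \le d} |\mu(U^i) - \nu(U^i)|$$

$$\mathrm{Ent}_U(\mu|\nu) = \sum_{i=1}^d \mu(U^i) \log\left( \mu(U^i)/\nu(U^i) \right) \qquad (10.37)$$

with the convention $0 \log 0 = 0 = 0 \log(0/0)$, and $\mathrm{Ent}_U(\mu|\nu) = \infty$ as soon
as $\mu \not\ll \nu$.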
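Since a measure in $\mathcal{P}(U)$ is determined by its masses on the $d$ cells of the partition, both quantities of this definition reduce to computations on probability vectors. The sketch below is a minimal illustration; the function names `d_U` and `ent_U` and the list representation of a measure are hypothetical choices made here.

```python
import math

def d_U(mu, nu):
    """Half the sum of absolute differences of the cell masses."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))

def ent_U(mu, nu):
    """U-relative entropy with the conventions 0 log 0 = 0 and +inf when mu is not << nu."""
    total = 0.0
    for a, b in zip(mu, nu):
        if a == 0.0:
            continue          # convention 0 log 0 = 0 = 0 log(0/0)
        if b == 0.0:
            return math.inf   # mu charges a cell that nu does not
        total += a * math.log(a / b)
    return total

# Cell masses of two measures on a 3-cell partition (made-up numbers).
mu = [0.5, 0.5, 0.0]
nu = [0.25, 0.25, 0.5]
print(d_U(mu, nu), ent_U(mu, nu), ent_U(nu, mu))
```

In this example $\mu \ll \nu$ but not conversely, so the entropy is finite in one direction and infinite in the other.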

As the reader may have noticed, $(\mathcal{P}(U), d_U)$ is a compact metric space. To
be more precise, we note that for any $d$-finite partition $U = (U^i)_{1 \le i \le d} \in \mathcal{U}$
the mapping $\mu \in \mathcal{P}(U) \to (\mu(U^i))_{1 \le i \le d} \in [0, 1]^d$ is clearly a homeomorphism
between $\mathcal{P}(U)$ and the compact $(d-1)$-dimensional simplex
$S(d) = \{a \in [0, 1]^d : \sum_{i=1}^d a(i) = 1\} \subset [0, 1]^d$. Since the identity mappings

$$e_U : u \in (E, \mathcal{E}) \to e_U(u) = u \in (E, \sigma(U))$$

are measurable, the set $\mathcal{P}(U)$ can alternatively be regarded as the set
formed by all $e_U$-images of set functions in $P(E)$. More precisely, we have
$\mathcal{P}(U) = q_U(P(E))$ with the continuous projection operators

$$q_U : \mu \in P(E) \to q_U(\mu) = \mu \circ e_U^{-1} \in \mathcal{P}(U)$$

To clarify the presentation, for any pair of set functions $(\mu, \nu) \in P(E)^2$,
sometimes we simplify the notation and we write $d_U(\mu, \nu)$ and $\mathrm{Ent}_U(\mu|\nu)$
instead of $d_U(q_U(\mu), q_U(\nu))$ and $\mathrm{Ent}_U(q_U(\mu)|q_U(\nu))$.

Note that $\mathrm{Ent}_U(\cdot|\nu)$ is finite and continuous on the compact sets
$\{\eta \in \mathcal{P}(U) : \eta \ll q_U(\nu)\}$ and $\mathrm{Ent}_U(\mu|\cdot)$ is continuous for every fixed $\mu$. For
later use, let us quote a technical lemma on the Lipschitz property of the
mapping $\mathrm{Ent}_U(\mu|\cdot)$.

Lemma 10.6.1 Let $U = (U^i)_{1 \le i \le d} \in \mathcal{U}$, $\lambda \in \mathcal{P}(U)$ and let $\rho > 0$ be a
given parameter. Also let $\mu, \eta, \eta' \in \mathcal{P}(U)$ be such that

$$\mu \ll \lambda, \quad \rho\lambda \le \eta \ll \lambda, \quad \text{and} \quad \rho\lambda \le \eta' \ll \lambda$$

Then we have the uniform $d_U$-Lipschitz inequality

$$|\mathrm{Ent}_U(\mu|\eta) - \mathrm{Ent}_U(\mu|\eta')| \le \frac{2}{\rho\, \lambda_*(U)}\, d_U(\eta, \eta')$$

with the positive constant $\lambda_*(U) = \wedge_{i : \lambda(U^i) > 0}\, \lambda(U^i) > 0$.







Proof: 

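Since $\rho\lambda \le \eta$ and $\rho\lambda \le \eta'$ we have $\mu \ll \eta$ and $\mu \ll \eta'$, from which we
conclude that the two entropies in the display above are finite. We also
note that

$$\mathrm{Ent}_U(\mu|\eta) - \mathrm{Ent}_U(\mu|\eta') = \sum_{i : \mu(U^i) > 0} \mu(U^i) \log \frac{\eta'(U^i)}{\eta(U^i)}$$

Using the elementary inequality $|\log x - \log y| \le |x - y|/(x \wedge y)$, which is
valid for any $x, y > 0$, and recalling that $\mu \ll \lambda$ we find that

$$|\mathrm{Ent}_U(\mu|\eta) - \mathrm{Ent}_U(\mu|\eta')| \le \sum_{i : \lambda(U^i) > 0} \mu(U^i)\, \frac{|\eta(U^i) - \eta'(U^i)|}{\eta(U^i) \wedge \eta'(U^i)} \le \frac{1}{\rho} \sum_{i : \lambda(U^i) > 0} \frac{|\eta(U^i) - \eta'(U^i)|}{\lambda(U^i)}$$

The end of the proof is now clear.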



Definition 10.6.2 For any rj € 'P(E), we also denote by H{.\r}) the rela- 
tive entropy criterion on P{E) defined by 

ft € P(E) — >■ H{fi\r}) = sup Ent[/(/x|»j) € [0,oo] 

V&A 

We say that a partition U is finer than another V and we write U >V 
as soon as a{U) D cr(V). Note that (C/*)i<i<j > (V^i<i<r if and only if 
there exists an r-partition (ai)j<i<r of the set of indexes {1, . . . ,d} such 
that V' = for any t < r. Using the well-known variational formula 

of the relative entropy on P([/) we have 

Entu(i/|jj)= sup (i/(/) -logTf(expf)) (10.38) 

f€Bt(U) 

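Hence it is clear that for any $\mu,\nu\in\mathbf P(E)$

$$U\ge V\ \Longrightarrow\ \mathrm{Ent}_V(\mu|\nu)\le\mathrm{Ent}_U(\mu|\nu)\quad\text{and}\quad d_V(\mu,\nu)\le d_U(\mu,\nu)\qquad(10.39)$$

We associate with any pair of partitions $U,V\in\mathcal U$ the smallest partition
$(U\vee V)$ such that $\sigma(U)\vee\sigma(V)=\sigma(U\vee V)$. This partition is simply defined
by setting

$$U\vee V=\{A\cap B\ :\ (A,B)\in(U\times V)\}$$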
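The refinement $U\vee V$ and the entropy part of the monotonicity property (10.39) are easy to illustrate on a finite set. This is a minimal sketch with assumed encodings: a partition is a list of `frozenset` cells of $E=\{0,\dots,5\}$, a measure is a list of point masses, and the helper names are ours.

```python
import math


def rel_entropy(nu, eta):
    """Ent(nu | eta) on a finite set, with 0 log 0 = 0."""
    total = 0.0
    for n_i, e_i in zip(nu, eta):
        if n_i > 0.0:
            if e_i == 0.0:
                return math.inf
            total += n_i * math.log(n_i / e_i)
    return total


def join(U, V):
    """Common refinement {A & B : A in U, B in V}, dropping empty intersections."""
    return [A & B for A in U for B in V if A & B]


def project(mu, U):
    """q_U(mu): the masses that mu gives to the cells of U."""
    return [sum(mu[x] for x in A) for A in U]


def ent_on_partition(mu, eta, U):
    """Ent_U(mu | eta) = Ent(q_U(mu) | q_U(eta))."""
    return rel_entropy(project(mu, U), project(eta, U))


mu = [0.1, 0.2, 0.05, 0.25, 0.3, 0.1]
eta = [1.0 / 6.0] * 6
U = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})]
V = [frozenset({0, 1, 2}), frozenset({3, 4, 5})]
W = join(U, V)  # cells {0,1}, {2}, {3}, {4,5}
```

Since `W` is finer than both `U` and `V`, the value `ent_on_partition(mu, eta, W)` dominates the two coarser entropies.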

Since for any $A\in U$ we trivially have $A=A\cap E=\cup_{B\in V}(A\cap B)$, we
conclude that $(U\vee V)\ge U$ and by symmetry arguments $(U\vee V)\ge V$. From
these observations, we conclude that $(\mathcal U,\ge)$ is a directed set. A natural
projective spectrum of $\mathcal U$ is defined as follows: For any $U\ge V$, we observe
that the identity mappings

$$e_{U,V}\ :\ x\in(E,\sigma(U))\ \longrightarrow\ e_{U,V}(x)=x\in(E,\sigma(V))$$







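are measurable and we have $\mathcal P(V)=p_{U,V}(\mathcal P(U))$ with the projection
operators

$$p_{U,V}\ :\ \mu\in\mathcal P(U)\ \longrightarrow\ p_{U,V}(\mu)=\mu\circ e_{U,V}^{-1}\in\mathcal P(V)$$

Using (10.39), we find that

$$U\ge V\ \Longrightarrow\ \forall(\mu,\nu)\in\mathcal P(U)^2\qquad d_V(p_{U,V}(\mu),p_{U,V}(\nu))\le d_U(\mu,\nu)$$

from which we conclude that the connecting mappings $p_{U,V}$ are
Lipschitz-continuous.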

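Proposition 10.6.1 The set $\mathcal P=((\mathcal P(U),d_U),p_{U,V})_{U\ge V}$ forms a projective
inverse spectrum of $\mathcal U$ with compact metric spaces $(\mathcal P(U),d_U)$ and
connecting maps $p_{U,V}$. Let $h:\varprojlim_{\mathcal U}\mathcal P\to\mathbf P(E)$ be the mapping that
associates to a point $\mu=(\mu^U)_{U\in\mathcal U}\in\varprojlim_{\mathcal U}\mathcal P$ the set function $h(\mu)\in\mathbf P(E)$
defined for any $A\in\mathcal E$ by

$$h(\mu)(A)=\mu^U(A)$$

where $U\in\mathcal U$ is some finite partition of $E$ such that $A\in\sigma(U)$. Then $h$ is a
homeomorphism between the compact spaces $\varprojlim_{\mathcal U}\mathcal P$ and $\mathbf P(E)$. In addition,
its inverse mapping $h^{-1}$ is given for any $\mu\in\mathbf P(E)$ by $h^{-1}(\mu)=(q_U(\mu))_{U\in\mathcal U}$.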

Proof; 

To see that $h$ is well-defined, we first need to check that $\mu^U(A)=\mu^V(A)$
for any pair of partitions $U,V$ such that $A\in\sigma(U)$ and $A\in\sigma(V)$. This
assertion is easily proved by noting that

$$A\in(\sigma(U)\cap\sigma(V))\ \Longrightarrow\ A\in\sigma(U\vee V)$$

and by the compatibility conditions in the definition of $\varprojlim_{\mathcal U}\mathcal P$ we have

$$e^{-1}_{(U\vee V),U}(A)=A=e^{-1}_{(U\vee V),V}(A)\ \Longrightarrow\ \mu^{U\vee V}(A)=\mu^U(A)=\mu^V(A)$$

To prove that $h(\mu)$ is an additive set function on $\mathcal E$, we choose a pair of
disjoint Borel sets $A,B$ and a pair $U,V$ of partitions with $A\in U$ and
$B\in V$. Since $A$ and $B$ are disjoint, we have

$$A=A\cap(E-B)=A\cap(\cup_{C\in V,\,C\ne B}\,C)=\cup_{C\in V,\,C\ne B}(A\cap C)\in\sigma(U\vee V)$$

and by symmetry arguments $B\in\sigma(U\vee V)$. Since $\mu^{U\vee V}\in\mathcal P(U\vee V)$, this
implies that

$$h(\mu)(A\cup B)=\mu^{U\vee V}(A\cup B)=\mu^{U\vee V}(A)+\mu^{U\vee V}(B)=h(\mu)(A)+h(\mu)(B)$$

Let us prove that $h$ is an injection. Let $\mu,\nu\in\varprojlim_{\mathcal U}\mathcal P$ be a pair of points
such that $h(\mu)=h(\nu)$. By the definition of $h$, we find that $\mu^U=\nu^U$ for







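any $U\in\mathcal U$, from which we conclude that $\mu=\nu$. On the other hand, for
any $\mu\in\mathbf P(E)$, we have $(q_U(\mu))_{U\in\mathcal U}\in\varprojlim_{\mathcal U}\mathcal P$ and

$$h\big((q_U(\mu))_{U\in\mathcal U}\big)(A)=q_U(\mu)(A)=\mu\circ e_U^{-1}(A)=\mu(A)$$

for any $A\in\sigma(U)$ for some $U\in\mathcal U$. We conclude that $h$ is a bijective map
from $\varprojlim_{\mathcal U}\mathcal P$ into $\mathbf P(E)$ and $h^{-1}(\mu)=(q_U(\mu))_{U\in\mathcal U}$.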

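It remains to prove that $h$ and $h^{-1}$ are continuous. To prove this final
step, we observe that for any sequence of points $\mu_n=(\mu_n^U)_{U\in\mathcal U}$ in $\varprojlim_{\mathcal U}\mathcal P$
and $\mu=(\mu^U)_{U\in\mathcal U}\in\varprojlim_{\mathcal U}\mathcal P$, we have the following series of equivalent
assertions

$$\lim_{n\to\infty}\mu_n=\mu\ \text{ in }\ \varprojlim_{\mathcal U}\mathcal P$$
$$\Longleftrightarrow\quad\forall U\in\mathcal U\qquad\lim_{n\to\infty}\mu_n^U=\mu^U\ \text{ in }\ (\mathcal P(U),d_U)$$
$$\Longleftrightarrow\quad\forall(A,U)\in(\mathcal E\times\mathcal U)\ \text{s.t.}\ A\in\sigma(U)\qquad\lim_{n\to\infty}\mu_n^U(A)=\mu^U(A)$$
$$\Longleftrightarrow\quad\forall A\in\mathcal E\qquad\lim_{n\to\infty}h(\mu_n)(A)=h(\mu)(A)$$
$$\Longleftrightarrow\quad\lim_{n\to\infty}h(\mu_n)=h(\mu)\ \text{ in }\ \mathbf P(E)$$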

This ends the proof of the proposition. 



The next lemma provides a representation of the relative entropy on
$\mathbf P(E)$ in terms of the relative entropies on the compact sets $\mathcal P(U)$, $U\in\mathcal U$.
As we shall see, this characterization is a particular case of the formula
(10.34) presented in Theorem 10.5.1.

Lemma 10.6.2 The domain $D_{H(\cdot|\eta)}=\{\mu\in\mathbf P(E)\ :\ H(\mu|\eta)<\infty\}$ of
$H(\cdot|\eta)$, $\eta\in\mathcal P(E)$, is included in $\mathcal P(E)$, and for any $\mu\in\mathcal P(E)$, we have

$$H(\mu|\eta)=\mathrm{Ent}(\mu|\eta)\qquad(10.40)$$



Proof: 

Formula (10.40) is well-known (see for instance Pinsker [267]). By the
variational formula (10.38), it suffices to check that $\cup_{U\in\mathcal U}B_b(U)$ is a dense subset
of $B_b(E)$. Note for instance that for any $f\in B_b(E)$ with $0\le f(x)<1$ we
have $\|f-f_n\|\le 1/n$ with

$$f_n=\sum_{i=0}^{n-1}\frac{i}{n}\,1_{f^{-1}([i/n,(i+1)/n))}\in B_b(U_n(f))$$

and $U_n(f)=(f^{-1}([i/n,(i+1)/n)))_{0\le i<n}\in\mathcal U$. Extending this observation
to any $f\in B_b(E)$, we prove that $\cup_{U\in\mathcal U}B_b(U)$ is a dense subset of $B_b(E)$
and we conclude that

$$\sup_{U\in\mathcal U}\mathrm{Ent}_U(\mu|\eta)=\sup_{U\in\mathcal U}\ \sup_{f\in B_b(U)}\big(\mu(f)-\log\eta(\exp f)\big)=\mathrm{Ent}(\mu|\eta)$$
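The approximation step used above, replacing $f$ by the step function $f_n$ that is constant on the cells $f^{-1}([i/n,(i+1)/n))$, can be sketched as follows. The test function `f` and the evaluation grid are arbitrary choices of ours, made only to illustrate the bound $\|f-f_n\|\le 1/n$.

```python
import math


def quantize(f, n):
    """Return f_n = sum_i (i/n) 1_{f^{-1}([i/n,(i+1)/n))} for f with values in [0, 1)."""
    def f_n(x):
        # floor(n f(x)) is the index i of the cell [i/n, (i+1)/n) containing f(x)
        return math.floor(n * f(x)) / n
    return f_n


def f(x):
    """An arbitrary bounded function with values in [0, 1) (illustrative choice)."""
    return 0.999 * 0.5 * (1.0 + math.sin(x))


grid = [0.01 * k for k in range(1000)]
errors = {
    n: max(abs(f(x) - quantize(f, n)(x)) for x in grid)
    for n in (2, 10, 100)
}
levels_for_n10 = {quantize(f, 10)(x) for x in grid}
```

Each `errors[n]` is at most `1/n`, and `f_n` takes at most `n` distinct values, one per cell of the partition $U_n(f)$.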







This ends the proof of (10.40). To prove that $D_{H(\cdot|\eta)}\subset\mathcal P(E)$, we observe
that

$$H(\mu|\eta)<\infty\ \Longleftrightarrow\ \exists c<\infty\ :\ \forall U\in\mathcal U\quad\mathrm{Ent}_U(\mu|\eta)\le c$$

To take the final step, we first check that the set of measures

$$\{\nu\in\mathcal P(U)\ :\ \mathrm{Ent}_U(\nu|\eta)\le c\}$$

is uniformly absolutely continuous with respect to $q_U(\eta)$ in the sense that
for any $\varepsilon>0$ there exists some $\delta>0$ such that for any $B\in\sigma(U)$ and $\nu$
such that $\mathrm{Ent}_U(\nu|\eta)\le c$ we have that $\nu(B)\le\varepsilon$ as soon as $\eta(B)\le\delta$. To
prove this claim, we use the fact that $u\log u\ge-1/e$ to check that

$$\nu(B)\le\varepsilon/2+\big(\mathrm{Ent}_U(\nu|\eta)+1/e\big)/\log(\varepsilon/2\delta)\le\varepsilon$$

as soon as $(c+1/e)/\log(\varepsilon/2\delta)\le\varepsilon/2$. Whenever $H(\mu|\eta)\le c<\infty$, we have

$$q_U(\mu)\in\{\nu\in\mathcal P(U)\ :\ \mathrm{Ent}_U(\nu|\eta)\le c\}$$

for any $U\in\mathcal U$. The result above readily implies that if $H(\mu|\eta)\le c$, then
for any $\varepsilon>0$ there exists some $\delta>0$ such that for any $B\in\mathcal E$ we have

$$\eta(B)\le\delta\ \Longrightarrow\ \mu(B)\le\varepsilon\qquad(10.41)$$

Let $(B_n)_{n\ge1}$ be a sequence of disjoint Borel sets. Since $\eta\in\mathcal P(E)$ is a
$\sigma$-additive measure, for any $\delta>0$ there exists some $p\ge1$ such that

$$\eta(\cup_{k\ge p}B_k)=\sum_{k\ge p}\eta(B_k)\le\delta$$

If we take $B=\cup_{k\ge p}B_k$ in (10.41), then we find that

$$0\le\mu(\cup_{k\ge1}B_k)-\sum_{k=1}^{p-1}\mu(B_k)=\mu(B)\le\varepsilon$$

from which we conclude that $\mu$ is also $\sigma$-additive and $\mu\in\mathcal P(E)$. This ends
the proof of the lemma. ■







10.6.3 Sanov’s Theorem in the τ-Topology

The next technical lemma provides large deviations probabilities for
independent and identically distributed sequences on finite state space models.

Lemma 10.6.3 Let $S$ be a finite state space, and let $Y=(Y^i)_{1\le i\le N}$ be
a collection of $N$ independent and $S$-valued random variables, identically
distributed according to a measure $\eta\in\mathcal P(S)$. We denote by $m$ the mapping
from the product space $S^N$ into $\mathcal P(S)$ that associates to each configuration
$x=(x^i)_{1\le i\le N}\in S^N$ the empirical measure $m(x)=\frac1N\sum_{i=1}^N\delta_{x^i}$. For any
$\mu\in m(S^N)$ we have

$$(N+1)^{-|S|}\ \le\ \exp\{N\,\mathrm{Ent}(\mu|\eta)\}\ P_\eta(m(Y)=\mu)\ \le\ 1$$

Proof: 

This result is rather well-known, see for instance Lemma 2.1.9 p. 15 in [112].
Its proof is rather elementary. We first notice that, for any $y\in m^{-1}(\mu)$ with
$\mu\in m(S^N)$, we have

$$P_\eta(Y=y)=\prod_{u\in S}\eta(u)^{N\mu(u)}=\exp\Big(N\sum_{u\in S}\mu(u)\log\eta(u)\Big)$$

This yields that

$$P_\eta(m(Y)=\mu)=P_\eta(Y\in m^{-1}(\mu))=|m^{-1}(\mu)|\,\exp\Big(N\sum_{u\in S}\mu(u)\log\eta(u)\Big)\qquad(10.42)$$

and $|m^{-1}(\mu)|=N!/\prod_{u\in S}(N\mu(u))!$. Since $P_\mu(m(Y)=\mu)\le1$ we find
that

$$|m^{-1}(\mu)|\le\exp\Big(-N\sum_{u\in S}\mu(u)\log\mu(u)\Big)$$

and we conclude that $P_\eta(m(Y)=\mu)\le\exp\{-N\,\mathrm{Ent}(\mu|\eta)\}$. On the other hand,
recalling that $\mathrm{Ent}(\mu|\eta)\ge0$, we note that

$$P_\eta(m(Y)=\mu)\le P_\mu(m(Y)=\mu)$$

This yields that

$$1=P_\eta(m(Y)\in m(S^N))\le|m(S^N)|\,P_\mu(m(Y)=\mu)$$

Using the fact that any $m(y)\in m(S^N)$ can be rewritten as
$m(y)=\sum_{u\in S}\frac{|\{i\,:\,y^i=u\}|}{N}\,\delta_u$, we find that $|m(S^N)|\le(N+1)^{|S|}$, from
which we conclude that

$$(N+1)^{-|S|}\exp\Big(-N\sum_{u\in S}\mu(u)\log\mu(u)\Big)\le|m^{-1}(\mu)|$$

This, together with (10.42), ends the proof of the lemma. ■
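The two-sided estimate of Lemma 10.6.3 can be verified exhaustively for small values of $N$ and $|S|$. The sketch below is ours: it enumerates all count vectors $(N\mu(u))_{u\in S}$, evaluates $\log P_\eta(m(Y)=\mu)$ from the multinomial formula (10.42), and compares the rescaled quantity with the bounds. The reference measure `eta` and the value of `N` are illustrative.

```python
import math


def rel_entropy(mu, eta):
    """Ent(mu | eta) on a finite set, with 0 log 0 = 0."""
    return sum(m * math.log(m / e) for m, e in zip(mu, eta) if m > 0.0)


def count_vectors(N, d):
    """Yield all d-tuples of nonnegative integers summing to N."""
    if d == 1:
        yield (N,)
        return
    for k in range(N + 1):
        for rest in count_vectors(N - k, d - 1):
            yield (k,) + rest


def log_prob_of_type(counts, eta):
    """log P_eta(m(Y) = mu) with N mu = counts, from the multinomial law (10.42)."""
    N = sum(counts)
    log_card = math.lgamma(N + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_card + sum(c * math.log(e) for c, e in zip(counts, eta) if c > 0)


def scaled_log_probability(counts, eta):
    """log of exp{N Ent(mu|eta)} P_eta(m(Y) = mu)."""
    N = sum(counts)
    mu = [c / N for c in counts]
    return N * rel_entropy(mu, eta) + log_prob_of_type(counts, eta)


eta = [0.5, 0.3, 0.2]  # assumed reference measure with full support
N = 12
values = [scaled_log_probability(c, eta) for c in count_vectors(N, len(eta))]
lower = -len(eta) * math.log(N + 1)  # log of (N+1)^{-|S|}
```

All entries of `values` lie between `lower` and `0`, up to rounding errors, in agreement with the lemma.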

With these preliminaries taken care of, we are now in a position to state
and prove our first main result.







Theorem 10.6.1 Let $(X^i)_{i\ge1}$ be a sequence of independent, $E$-valued
random variables, identically distributed according to a measure $\eta\in\mathcal P(E)$.
For any $N$, we denote by $Q^N$ the law on $\mathcal P(E)$ of the $N$-empirical
measures $\frac1N\sum_{i=1}^N\delta_{X^i}$. For any $U\in\mathcal U$, the sequence of distributions $Q^N\circ q_U^{-1}$
satisfies an LDP on $(\mathcal P(U),d_U)$ with the good entropy rate function

$$I_U=\mathrm{Ent}_U(\cdot|\eta)$$



Proof: 

Let $M(U)$ be the set of all signed measures on $\sigma(U)$. Since each
$U=(U^i)_{i\le d}\in\mathcal U$ is a finite $d$-partition, we first observe that the sets $B_b(U)$ and
$M(U)$ are homeomorphic to $\mathbb R^d$ and the elementary duality mapping

$$(\mu,f)\in(M(U)\times B_b(U))\ \longrightarrow\ \langle f,\mu\rangle=\int_E f(x)\,\mu(dx)$$

determines a representation of $M(U)^\star$ as $B_b(U)$. Let $\Lambda_U:B_b(U)\to\mathbb R$ be
the logarithmic moment-generating function defined for any $f\in B_b(U)$ by

$$\Lambda_U(f)=\frac1N\log\int e^{N\langle f,\mu\rangle}\,(Q^N\circ q_U^{-1})(d\mu)=\frac1N\log E\big(e^{N\,m(X)(f)}\big)$$
$$=\log\eta(\exp f)=\log q_U(\eta)(\exp f)<\infty$$

where $m(X)=\frac1N\sum_{i=1}^N\delta_{X^i}$. The LDP upper bound follows from the fact
that $\mathcal P(U)$ is compact and $\mathrm{Ent}_U(\cdot|\eta)$ is the Fenchel-Legendre transform of
the function $\Lambda_U$; that is, we have that

$$\mathrm{Ent}_U(\nu|\eta)=\sup_{f\in B_b(U)}\big(\nu(f)-\log\eta(\exp f)\big)$$

for any $\nu\in\mathcal P(U)$. To prove the LDP lower bound, let $A\subset\mathcal P(U)$ be an
open set such that $\mathrm{Ent}_U(A|\eta)<\infty$ (otherwise the proof of the lower bound
is trivial). For any $\delta>0$, there exists a point $\mu\in A$ such that

$$\mathrm{Ent}_U(\mu|\eta)\le\mathrm{Ent}_U(A|\eta)+\delta\qquad(10.43)$$

Since $\mu\in A$ and $A$ is open in $(\mathcal P(U),d_U)$, there exists some $\varepsilon>0$ such that

$$V_U(\mu,\varepsilon)=\{\nu\in\mathcal P(U)\ :\ d_U(\nu,\mu)<\varepsilon\}\subset A$$

Up to a change of index we suppose that $\mu(U^d)=\vee_{i=1}^d\mu(U^i)\,(>0)$. We
associate with $\mu$ the $N$-approximation distributions $\mu^N\in\mathcal P(U)$ defined by

$$\mu^N(U^i)=\begin{cases}[N\mu(U^i)]/N & \text{if } 1\le i<d\\[2pt] 1-\sum_{j=1}^{d-1}[N\mu(U^j)]/N & \text{if } i=d\end{cases}$$

By construction we note that

$$\mu^N\ll\mu\ll q_U(\eta)\quad\text{and}\quad\mathrm{Ent}_U(\mu^N|\eta)<\infty$$
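The $N$-approximation $\mu^N$ can be computed exactly with rational arithmetic. In this sketch, which is ours and uses illustrative masses, the cells are ordered so that the last one carries the largest mass, and we only check elementary properties: $N\mu^N$ has integer entries, $\mu^N\ll\mu$, and every cell mass moves by at most $(d-1)/N$. No particular definition of the metric $d_U$ is assumed.

```python
import math
from fractions import Fraction


def n_approximation(mu, N):
    """mu^N: floor(N mu(U^i))/N on the first d-1 cells, remaining mass on the last cell."""
    head = [Fraction(math.floor(N * m), N) for m in mu[:-1]]
    return head + [1 - sum(head)]


# Illustrative cell masses; the last cell has the largest mass.
mu = [Fraction(3, 20), Fraction(1, 4), Fraction(3, 5)]
d = len(mu)

approximations = {N: n_approximation(mu, N) for N in (7, 20, 1000)}
cell_errors = {
    N: max(abs(a - b) for a, b in zip(mu_N, mu))
    for N, mu_N in approximations.items()
}
```

For instance `approximations[7]` is `[1/7, 1/7, 5/7]`, an empirical measure of 7 points, and `cell_errors[N]` never exceeds `(d - 1) / N`.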







Since

$$d_U(\mu^N,\mu)\le(d-1)/N$$

we have $\mu^N\in V_U(\mu,\varepsilon)$ for any $N>N_1=(d-1)/\varepsilon$ and therefore

$$q_U^{-1}\big(\{\nu\in\mathcal P(U)\ :\ d_U(\nu,\mu^N)=0\}\big)=\{\nu\in\mathcal P(E)\ :\ d_U(\nu,\mu^N)=0\}\subset q_U^{-1}(V_U(\mu,\varepsilon))$$

By the definition of the empirical measure $m(X)$, it is also clear that



m 



»=i 



By Lemma 10.6.3, we get

$$Q^N\big(q_U^{-1}(V_U(\mu,\varepsilon))\big)\ge(N+1)^{-d}\exp\{-N\,\mathrm{Ent}_U(\mu^N|\eta)\}$$

The continuity of the entropy function $\mathrm{Ent}_U(\cdot|\eta)$ on the set of measures
$\{\mu\in\mathcal P(U)\ :\ \mu\ll q_U(\eta)\}$ now implies that, for any $N\ge N_2$ and some
$N_2\ge1$,

$$\mathrm{Ent}_U(\mu^N|\eta)\le\mathrm{Ent}_U(\mu|\eta)+\delta$$

We finally conclude that for any $N\ge(N_1\vee N_2)$

$$\frac1N\log Q^N(q_U^{-1}(A))\ \ge\ \frac1N\log Q^N\big(q_U^{-1}(V_U(\mu,\varepsilon))\big)\ \ge\ -\mathrm{Ent}_U(A|\eta)-\frac dN\log(N+1)-2\delta$$

Letting $N\to\infty$ and then $\delta\to0$, the proof of the LDP lower
bound is completed. This ends the proof of Theorem 10.6.1. ■



It is convenient at this stage to make a couple of remarks. First we
observe that Theorem 10.6.1 can also be derived using Sanov’s theorem.
Indeed the strong version of Sanov’s theorem implies that $Q^N$ satisfy the LDP in
$\mathcal P(E)$ equipped with the τ-topology with the convex and good rate function
$\mathrm{Ent}(\cdot|\eta)$. Since for any finite Borel partition $U\in\mathcal U$ the projection operator
$q_U:\mathcal P(E)\to\mathcal P(U)$ is a τ-continuous mapping, by the contraction principle
we conclude that the measures $Q^N\circ q_U^{-1}$ satisfy the LDP on the compact
topological space $(\mathcal P(U),d_U)$ with the good rate function

$$I_U(\nu)=\inf\{\mathrm{Ent}(\mu|\eta)\ :\ \mu\in\mathcal P(E)\ \text{s.t.}\ q_U(\mu)=\nu\}=\mathrm{Ent}_U(\nu|\eta)$$







To prove the last equality we first use Lemma 10.6.2 to check that
$I_U(\nu)\ge\mathrm{Ent}_U(\nu|\eta)$. The reverse inequality is based on the fact that for any $\nu\in\mathcal P(U)$
we have

$$\mathrm{Ent}(\Omega_\eta(\nu)|\eta)=\mathrm{Ent}_U(\nu|\eta)$$

where $\Omega_\eta$ is the mapping from $\mathcal P(U)$ into $q_U^{-1}(\mathcal P(U))$ that associates with
any $\nu\in\mathcal P(U)$ the measure defined by the formula

$$\Omega_\eta(\nu)(A\cap U^i)=\begin{cases}\eta(A\cap U^i)\,\nu(U^i)/\eta(U^i) & \text{if } \eta(U^i)>0\\[2pt] \nu(A\cap U^i) & \text{if } \eta(U^i)=0\end{cases}$$

for any $A\in\mathcal E$ and $1\le i\le d$, where $U=(U^i)_{1\le i\le d}$. In the reverse angle,
Sanov’s theorem can also be derived using Theorem 10.6.1.
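The contraction identity $I_U(\nu)=\mathrm{Ent}_U(\nu|\eta)$ and the role of $\Omega_\eta(\nu)$ can be illustrated on a finite set. This is a sketch under our own encoding (point masses as lists, cells as lists of indices, all cells charged by `eta`): the measure `lift(nu, eta, U)` spreads each mass $\nu(U^i)$ inside $U^i$ proportionally to $\eta$, and randomly drawn competitors with the same cell masses never do better.

```python
import math
import random


def rel_entropy(mu, eta):
    """Ent(mu | eta) on a finite set, with 0 log 0 = 0 (eta has full support here)."""
    return sum(m * math.log(m / e) for m, e in zip(mu, eta) if m > 0.0)


def project(mu, U):
    """q_U(mu): the cell masses of mu."""
    return [sum(mu[x] for x in A) for A in U]


def lift(nu_cells, eta, U):
    """Omega_eta(nu) when eta(U^i) > 0 for every cell: mass nu(U^i) spread like eta."""
    out = [0.0] * len(eta)
    for mass, A in zip(nu_cells, U):
        eta_A = sum(eta[x] for x in A)
        for x in A:
            out[x] = eta[x] * mass / eta_A
    return out


eta = [0.1, 0.2, 0.3, 0.15, 0.25]  # assumed reference measure
U = [[0, 1], [2], [3, 4]]           # assumed partition of {0,...,4}
nu = [0.5, 0.1, 0.4]                # target cell masses

omega = lift(nu, eta, U)
ent_cells = rel_entropy(nu, project(eta, U))  # Ent_U(nu | eta)
ent_lift = rel_entropy(omega, eta)            # Ent(Omega_eta(nu) | eta)

rng = random.Random(0)


def random_competitor():
    """A random measure mu with q_U(mu) = nu."""
    out = [0.0] * len(eta)
    for mass, A in zip(nu, U):
        weights = [rng.random() + 1e-3 for _ in A]
        for x, w in zip(A, weights):
            out[x] = mass * w / sum(weights)
    return out


competitors = [rel_entropy(random_competitor(), eta) for _ in range(500)]
```

The values `ent_lift` and `ent_cells` coincide, and `min(competitors)` is not smaller, so the infimum defining $I_U(\nu)$ is attained at $\Omega_\eta(\nu)$ in this example.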

The end of this section is concerned with proving the following corollary. 

Corollary 10.6.1 (Sanov) Let $(X^i)_{i\ge1}$ be a sequence of independent,
$E$-valued random variables, identically distributed according to a measure
$\eta\in\mathcal P(E)$. For any $N$, we denote by $Q^N$ the law on $\mathcal P(E)$ of the $N$-empirical
measures $\frac1N\sum_{i=1}^N\delta_{X^i}$. The sequence of distributions $Q^N$ satisfies an LDP
on $\mathcal P(E)$ equipped with the τ-topology with the good entropy rate function
$\mathrm{Ent}(\cdot|\eta)$.



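To prove that Theorem 10.6.1 implies Sanov’s theorem, we first extend
$Q^N$ to $\mathbf P(E)$ by setting $Q^N(\mathbf P(E)-\mathcal P(E))=0$ for any $N\ge1$. Then
we let $P^N$ be the sequence of distributions on $\varprojlim_{\mathcal U}\mathcal P$ defined by
$P^N=Q^N\circ h$, where $h$ is the homeomorphism from $\varprojlim_{\mathcal U}\mathcal P$ into $\mathbf P(E)$ introduced
in Proposition 10.6.1. Since the canonical projection $p_U$ from $\varprojlim_{\mathcal U}\mathcal P$ into
$\mathcal P(U)$ can be rewritten as $p_U=q_U\circ h$, we find that $P^N\circ p_U^{-1}=Q^N\circ q_U^{-1}$ for
any $U\in\mathcal U$. Combining Theorem 10.6.1 with Theorem 10.5.1, we conclude
that the sequence of distributions $P^N=Q^N\circ h$ satisfies the LDP on $\varprojlim_{\mathcal U}\mathcal P$
with the good rate function

$$\mu\in\varprojlim_{\mathcal U}\mathcal P\ \longrightarrow\ \sup_{U\in\mathcal U}\mathrm{Ent}_U(p_U(\mu)|\eta)$$

Recalling that $h:\varprojlim_{\mathcal U}\mathcal P\to\mathbf P(E)$ is a homeomorphism and using the fact
that $p_U\circ h^{-1}=q_U$, by the contraction theorem (Theorem 10.2.1) we find
that $Q^N$ satisfies the LDP on $\mathbf P(E)$ (equipped with the $\tau_1$-topology) with
the good rate function

$$H(\cdot|\eta)\ :\ \mu\in\mathbf P(E)\ \longrightarrow\ H(\mu|\eta)=\sup_{U\in\mathcal U}\mathrm{Ent}_U(\mu|\eta)$$

Recalling that $Q^N(\mathcal P(E))=1$, we also have $Q^N(A)=Q^N(A\cap\mathcal P(E))$ for
any $\tau_1$-open or closed subset $A\subset\mathbf P(E)$. By Lemma 10.6.2, we have

$$D_{H(\cdot|\eta)}\subset\mathcal P(E)\qquad(10.44)$$

This yields that for any subset $A\subset\mathbf P(E)$

$$H(A|\eta)=H(A\cap\mathcal P(E)|\eta)=\mathrm{Ent}(A\cap\mathcal P(E)\,|\,\eta)\qquad(10.45)$$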







Since the relative topology induced on $\mathcal P(E)$ by $\mathbf P(E)$ is the τ-topology, the
τ-open (respectively τ-closed) sets are of the form $A\cap\mathcal P(E)$ with $A\subset\mathbf P(E)$
$\tau_1$-open (respectively $\tau_1$-closed). Using the fact that $Q^N$ satisfies the LDP
on $\mathbf P(E)$ with rate function $H(\cdot|\eta)$, we deduce from (10.44) and (10.45) that
$Q^N$ satisfies the LDP lower and upper bounds on $\mathcal P(E)$ for any τ-open
or closed subsets $A\subset\mathcal P(E)$. Since $H(\cdot|\eta)$ is a good rate function on the
compact set $\mathbf P(E)$, the level sets $H(\cdot|\eta)^{-1}([0,a])$ are $\tau_1$-closed and hence
$\tau_1$-compact in $\mathbf P(E)$. Since we have proved that $H(\cdot|\eta)^{-1}([0,a])\subset\mathcal P(E)$,
we conclude that these level sets are also τ-compact subsets of $\mathcal P(E)$.
This completes the proof of Corollary 10.6.1.



10.7 Path-Space and Interacting Particle Models 

To connect our McKean interpretation models and their particle approxi- 
mation schemes with the integral transfer analysis presented in Section 10.4, 
we shall adopt hereafter a simplified system of notation. The time horizon 
n and the initial distribution t}q are always fixed. We slightly abuse the 
notation and when no confusion can be made, suppress the correspond- 
ing superscripts. We also simplify notation and we write (K, Cl, E) instead 
of Efi). To each x — (xq, . . . , x^)i<^<^ € Cl and 

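$x_p=(x_p^i)_{1\le i\le N}\in E_p^N$, we associate the empirical measures $m(x)$ and
$m(x_p)$ by setting

$$m(x)=\frac1N\sum_{i=1}^N\delta_{(x_0^i,\dots,x_n^i)}\quad\text{and}\quad m(x_p)=\frac1N\sum_{i=1}^N\delta_{x_p^i}$$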



10.7.1 Proof of Theorem 10.1.1 

For a fixed distribution rf on path space Cl, we denote = (Qe(,,))®^, 

the A^-fold tensor product of the measure Qe(ij). Let P„ a € R, be the 

collection of distributions on Cl^ defined by 

dpN 

■^^(w) = exp {N[aSN{i^, J?) + Ua(7r^(u;), ;?)]) 
for -a.e. w € Cl^ with 

” f 

5at(w,7/) = V / m{wp-i,Up){d{up-i,Up)) 

p=l *'Ep-ixEp 



X 




) (^p- 1 > » ) 








and 







Up. 1 {dUp- 1 ) log {fip.i,Tjp.i){up-i)\ 

-1 

(10.46) 



Under the 7V-IPS model (^p)o<p<n is the 7V-interacting particle model 

associated with the collection of Markov transitions Kp°J from Ep-i into 
Ep defined by 

Kp°p{up.udup-i) 



1 

\t^iVp-i){'^p-i) 



dACp,p(up-i, .) . . 

.d/i^p,t/p_,(tip-i, •) , 



^ P , Vt >- 1 (^- 1 > 



It is convenient at this stage to make a couple of remarks. 



• First we observe that for a = 1 the transitions above are not indepen- 
dent on T). Consequently, do not depend on r], and the probability 
measure 

=def. 

coincides with the distribution of the ./V-particle model associated 
with the collection of Markov transitions Kp^p. 

• We also observe that the parameter $a$ measures the degree of
interaction in the system. For instance, for $a=0$ we have

$$P_0^N=Q_\eta^N=(Q_{\Sigma(\eta)})^{\otimes N}$$

and under $P_0^N$ the $N$-particle model consists of $N$ independent
particles with elementary transitions $K_{p,\eta_{p-1}}$. By Sanov’s theorem, under
the tensor product measure $Q_\eta^N=(Q_{\Sigma(\eta)})^{\otimes N}$ the laws of the
empirical measures of the path particles satisfy the LDP with a good rate
function $H_\eta$ given by

$$H_\eta\ :\ \mu\in\mathcal P(\Omega)\ \longrightarrow\ H_\eta(\mu)=\mathrm{Ent}(\mu\,|\,Q_{\Sigma(\eta)})\in[0,\infty]$$

When the mapping satisfies the regularity conditions (10.7), the integral
transfer lemma (Lemma 10.4.1) applies and we readily conclude that the
law of the empirical measures of the path particles under $P_1^N$ satisfies the
LDP with a good rate function $I$ given by

$$I\ :\ \mu\in\mathcal P(\Omega)\ \longrightarrow\ I(\mu)=H_\mu(\mu)=\mathrm{Ent}(\mu\,|\,Q_{\Sigma(\mu)})\in[0,\infty]$$

This ends the proof of Theorem 10.1.1. 







10.7.2 Sufficient Conditions 

We end this section with some simple and easily checked conditions on
the pair $(G_n,M_n)$ under which the desired regularity conditions (10.7) are
satisfied. To simplify the presentation, we only examine the situation where
$K_{n,\eta}(u_{n-1},\cdot)=\Phi_n(\eta)$. The McKean interpretation model (8.1) can be
studied along the same line of argument as the one used in the end of
Section 8.5.

We recall that the one step mappings $\Phi_n$ associated with the prediction flow
$\eta_n$ are defined for any $f_n\in B_b(E_n)$ by the equation

$$\Phi_n(\eta)(f_n)=\eta(G_{n-1}M_n(f_n))/\eta(G_{n-1})$$
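On a finite state space the one step mapping and the associated $N$-particle density profiles are straightforward to simulate. The sketch below is only an illustration under assumptions of ours: the potential `G` and the transition matrix `M` do not depend on time, all states live in $\{0,1,2\}$, and each generation consists of $N$ independent draws from $\Phi(\eta^N_{p-1})$, which corresponds to the simple McKean interpretation considered here.

```python
import random


def phi(eta, G, M):
    """Phi(eta)(y) = sum_x eta(x) G(x) M[x][y] / sum_x eta(x) G(x)."""
    norm = sum(e * g for e, g in zip(eta, G))
    size = len(M[0])
    return [
        sum(e * g * row[y] for e, g, row in zip(eta, G, M)) / norm
        for y in range(size)
    ]


def particle_profile(eta0, G, M, n, N, rng):
    """Empirical measures eta_p^N, 0 <= p <= n, of N i.i.d. draws from Phi(eta_{p-1}^N)."""
    states = range(len(eta0))
    target = eta0
    flow = []
    for _ in range(n + 1):
        sample = rng.choices(states, weights=target, k=N)
        empirical = [sample.count(s) / N for s in states]
        flow.append(empirical)
        target = phi(empirical, G, M)
    return flow


# Illustrative model (assumed values): strictly positive potential, 3 states.
eta0 = [0.2, 0.5, 0.3]
G = [1.0, 0.5, 2.0]
M = [[0.8, 0.1, 0.1], [0.2, 0.6, 0.2], [0.1, 0.3, 0.6]]

exact = [eta0]
for _ in range(3):
    exact.append(phi(exact[-1], G, M))

profile = particle_profile(eta0, G, M, n=3, N=20000, rng=random.Random(0))
```

For large `N` the simulated `profile` stays close to the deterministic flow `exact`; the large-deviation results of this chapter quantify how unlikely the remaining fluctuations are.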

We suppose the pair $(G_n,M_n)$ satisfies the regularity conditions $(G)$ and
$(M)_{exp}$ presented in Section 3.5.2 on page 116, for some parameters $\varepsilon_n(G)>0$,
some pair of functions $(k_n,a_n)$ and some reference measures $p_n$. In this
situation, we easily check that the mappings $V_a$ introduced in (10.46) take
the form



Va(M,V) 




d$p(/ip-i) 

d^p(Vp-i) 



) 



a 



d^p(Vp-i) 



Similarly, we note that the mappings (10.7) are given by 

li(dup-i) logZj°\n,T)){up-i) = ( ^ln(!)j ) 

Our objective is to prove that our regularity assumptions on the pair
$(G_n,M_n)$ ensure that for each $n\ge1$, $\eta\in\mathcal P(E_{n-1})$, and $a\ge1$ the mappings

$$L_{a,n}(\cdot,\eta)\ :\ \mu\ \longrightarrow\ \int\big(d\Phi_n(\mu)/d\Phi_n(\eta)\big)^a\,d\Phi_n(\eta)$$

are bounded and continuous at each $\eta$. We start by noting that for any
$\mu\in\mathcal P(E_{n-1})$ the distributions $\Phi_n(\mu)$ and $p_n$ are mutually absolutely continuous
and

$$\exp(-a_n(u_n))\ \le\ \frac{d\Phi_n(\mu)}{dp_n}(u_n)=\frac{\mu(G_{n-1}\,k_n(\cdot,u_n))}{\mu(G_{n-1})}\ \le\ \exp a_n(u_n)$$

This readily yields that $\Phi_n(\mu)$, $\mu\in\mathcal P(E_{n-1})$, forms a collection of mutually
absolutely continuous distributions and for any pair $(\mu,\eta)\in\mathcal P(E_{n-1})^2$ we
have

$$\exp(-2a_n)\ \le\ d\Phi_n(\mu)/d\Phi_n(\eta)\ \le\ \exp(2a_n)$$
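The two-sided density estimates can be observed numerically on finite state spaces. This sketch is ours and makes a simplifying assumption: the values `k[x][u]` are drawn at random in $[e^{-a(u)},e^{a(u)}]$ and are not normalised into a true transition density with respect to $p_n$, since only the pointwise bounds matter for the inequalities being checked.

```python
import math
import random

rng = random.Random(1)

a = [0.3, 1.0, 0.5]        # a_n(u) for three target states u (assumed values)
G = [0.5, 1.0, 2.0, 1.5]   # strictly positive potential on four source states
# k[x][u] lies in [exp(-a(u)), exp(a(u))], as in the assumed regularity condition.
k = [[math.exp(rng.uniform(-a_u, a_u)) for a_u in a] for _ in G]


def density(mu, u):
    """d Phi(mu)/dp (u) = mu(G k(., u)) / mu(G)."""
    numerator = sum(m * g * row[u] for m, g, row in zip(mu, G, k))
    return numerator / sum(m * g for m, g in zip(mu, G))


def random_probability(size):
    """A random probability vector of the given size."""
    weights = [rng.random() for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


checks = []
for _ in range(200):
    mu, eta = random_probability(len(G)), random_probability(len(G))
    for u, a_u in enumerate(a):
        d_mu, d_eta = density(mu, u), density(eta, u)
        checks.append(math.exp(-a_u) <= d_mu <= math.exp(a_u))
        checks.append(math.exp(-2 * a_u) <= d_mu / d_eta <= math.exp(2 * a_u))
```

Every entry of `checks` is `True`: the density is a weighted average of the values `k[x][u]`, so it inherits their bounds, and the ratio of two such densities is controlled by $e^{\pm2a_n(u)}$.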



Under our assumptions, the desired boundedness condition is clearly met.
Let us check that $L_{a,n}(\cdot,\eta)$, $a\ge1$, is continuous at $\eta$. To this end, we







observe that 



\La,n{fi,V) - La,n(T},V)\ 



< 

< 




- 1| d^niv) 
-1 



Using the decomposition (7.31) presented in the proof of Corollary 7.4.4, 
we find that 



d^niri) 



(Wn)-l 



g2an(tin) 



KGn-l) 

V{Gn-\) 



t^{Gn-l fen(«i^n)) _ j 
ViGn—1 ^n(*)Wn)) 



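From our conditions on the pair $(a_n,G_n)$, it now suffices to check the
continuity of the mappings

$$\mu\ \longrightarrow\ L'_{a,n}(\mu,\eta)=\int\left|\frac{\mu(G_{n-1}\,k_n(\cdot,u_n))}{\eta(G_{n-1}\,k_n(\cdot,u_n))}-1\right|\,e^{2(a-1)a_n(u_n)}\ \Phi_n(\eta)(du_n)$$

at $\eta\in\mathcal P(E_{n-1})$. We fix $n$, $u_n\in E_n$, and $\eta$, and we let $f_{u_n}$ be the function
on $E_{n-1}$ defined by

$$f_{u_n}(u_{n-1})=G_{n-1}(u_{n-1})\,k_n(u_{n-1},u_n)/\eta(G_{n-1}\,k_n(\cdot,u_n))$$

We also note that $0\le f_{u_n}(u_{n-1})\le\big(e^{2a_n(u_n)}/\varepsilon_{n-1}(G)\big)$ and

$$\Phi_n(\eta)(du_n)=\frac{\eta(G_{n-1}\,k_n(\cdot,u_n))}{\eta(G_{n-1})}\ p_n(du_n)\ \le\ e^{a_n(u_n)}\ p_n(du_n)$$

In this notation, we find that

$$0\le L'_{a,n}(\mu,\eta)\le\int|\mu(f_{u_n})-\eta(f_{u_n})|\ e^{(2a-1)a_n(u_n)}\ p_n(du_n)$$

Under our assumptions, the dominated convergence theorem applies. For
any sequence $\mu_m\to\eta$ weakly, we find that $\lim_{m\to\infty}L'_{a,n}(\mu_m,\eta)=0$.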



10.8 Particle Density Profile Models 

10.8.1 Introduction 

In Section 10.7, we have examined the LDP for the McKean particle
measures associated with a fairly general class of McKean interpretations
of the measure-valued process (10.1). The proof of Theorem 10.1.1
was based on an appropriate change of probability measure so that the







law of the $N$-particle model $(\xi_0^i,\dots,\xi_n^i)_{1\le i\le N}$ consists of a regular Laplace
distribution on path product spaces $(E_0\times\dots\times E_n)^N$. This strategy is
therefore restricted to regular McKean models such that
$K_{n+1,\mu}(u_n,\cdot)\sim K_{n+1,\eta}(u_n,\cdot)$ for any pair of measures $(\mu,\eta)\in\mathcal P(E_n)^2$ and therefore does
not apply to complete genealogical tree models. To remove this condition,
we shall be dealing here with the flow of particle density profiles $(\eta_p^N)_{0\le p\le n}$
associated with the simple McKean interpretation

of the nonlinear model (10.1). Before entering into some details about the
proof of Theorem 10.1.2, it is useful to examine some direct consequences
of Theorem 10.1.1. We recall that the flow $(\eta_p^N)_{0\le p\le n}$ and the particle
McKean measure $\mathbb K^N_{\eta_0,n}=\frac1N\sum_{i=1}^N\delta_{(\xi_0^i,\dots,\xi_n^i)}$ are connected by the formula

$$\Sigma_n(\mathbb K^N_{\eta_0,n})=(\eta_p^N)_{0\le p\le n}\qquad(10.47)$$

where $\Sigma_n$ is the continuous mapping that associates with a given measure
on the path space $E_{[0,n]}=\prod_{p=0}^nE_p$ its $(n+1)$ one-dimensional marginals (see
(10.6)).

Definition 10.8.1 For any $n\ge0$ and $N\ge1$, we denote by $Q_n^N$ the law
of the $N$-particle density profiles $(\eta_p^N)_{0\le p\le n}$ on $\mathcal P^n(E)$.

The following corollary of the LDP on path space presented in Theorem 
10.1.1 allows us to identify the candidate rate function that governs the 
LDP for the particle density profiles. 

Corollary 10.8.1 In the context of the statement of Theorem 10.1.1, for
any $n\in\mathbb N$, $Q_n^N$ satisfy the LDP in $\mathcal P^n(E)$ equipped with the product weak
topology and with the good rate function $J_n$ on $\mathcal P^n(E)$ given by

$$J_n((\mu_p)_{0\le p\le n})=\sum_{p=0}^n\mathrm{Ent}(\mu_p\,|\,\Phi_p(\mu_{p-1}))\qquad(10.48)$$

Proof: 

Recalling that the LDP is preserved under continuous mappings (see
Theorem 10.2.1, in Section 10.2), we deduce from (10.47) that $\Sigma_n(\mathbb K^N_{\eta_0,n})$ satisfies
the LDP on $\mathcal P^n(E)$ (equipped with the weak topology) with the good
rate function defined by

$$J_n((\mu_p)_{p\le n})=\inf\{\mathrm{Ent}(\mu\,|\,Q_{\Sigma_n(\mu)})\ :\ \mu\in\mathcal P(E_{[0,n]})\ \text{s.t.}\ \Sigma_n(\mu)=(\mu_p)_{p\le n}\}$$

To prove (10.48), we first note that

$$\Sigma_n(\mu_0\otimes\dots\otimes\mu_n)=(\mu_p)_{0\le p\le n}$$
$$Q_{(\mu_p)_{0\le p\le n}}=\eta_0\otimes\Phi_1(\mu_0)\otimes\dots\otimes\Phi_n(\mu_{n-1})$$







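from which we find that

$$\mathrm{Ent}(\mu_0\otimes\dots\otimes\mu_n\,|\,Q_{(\mu_p)_{0\le p\le n}})=\sum_{p=0}^n\mathrm{Ent}(\mu_p\,|\,\Phi_p(\mu_{p-1}))$$

and $J_n((\mu_p)_{0\le p\le n})\le\sum_{p=0}^n\mathrm{Ent}(\mu_p\,|\,\Phi_p(\mu_{p-1}))$. To prove the reverse
inequality, we recall that for any $\mu\in\mathcal P(E_{[0,n]})$ we have

$$\mathrm{Ent}(\mu\,|\,Q_{\Sigma_n(\mu)})=\sup_{V_n\in C_b(E_{[0,n]})}\big(\mu(V_n)-\log Q_{\Sigma_n(\mu)}(e^{V_n})\big)$$
$$\ge\sup_{v_0\in C_b(E_0),\dots,v_n\in C_b(E_n)}\ \sum_{p=0}^n\big[\mu_p(v_p)-\log\Phi_p(\mu_{p-1})(e^{v_p})\big]$$
$$=\sum_{p=0}^n\mathrm{Ent}(\mu_p\,|\,\Phi_p(\mu_{p-1}))$$

The end of the proof of the corollary is now clear. ■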
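Both inequalities used in this proof have an elementary finite-state counterpart. The following sketch, with two-point marginals chosen by us for illustration, checks that the entropy of a product measure with respect to a product reference is the sum of the marginal entropies, and that a correlated measure with the same marginals has a larger entropy.

```python
import math


def rel_entropy(mu, nu):
    """Ent(mu | nu) on a finite set, with 0 log 0 = 0 (nu has full support here)."""
    return sum(m * math.log(m / n) for m, n in zip(mu, nu) if m > 0.0)


def product(a, b):
    """Product measure on pairs (x, y), stored in row-major order."""
    return [x * y for x in a for y in b]


mu0, mu1 = [0.5, 0.5], [0.5, 0.5]   # prescribed marginals (assumed values)
nu0, nu1 = [0.3, 0.7], [0.6, 0.4]   # marginals of the product reference measure
reference = product(nu0, nu1)

sum_of_marginals = rel_entropy(mu0, nu0) + rel_entropy(mu1, nu1)
ent_product = rel_entropy(product(mu0, mu1), reference)

# A correlated path measure with the same marginals mu0 and mu1.
coupled = [0.4, 0.1, 0.1, 0.4]
ent_coupled = rel_entropy(coupled, reference)
```

Here `ent_product` equals `sum_of_marginals`, while `ent_coupled` is strictly larger, so among path measures with prescribed marginals the product measure minimises the entropy, as in the identification of $J_n$.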



10.8.2 Strong Large-Deviation Principles 

The proof of Theorem 10.1.2 is based on a projective limit interpretation
of the product τ-topology in the spirit of the proof of Sanov’s Theorem
presented in Section 10.6. All the notation and most results presented in
that section will be used in the forthcoming development. We encourage
the reader to make a brief visit to that section before entering into more
details.

Since we shall be working with nonhomogeneous Hausdorff topological
spaces $(E_n,\mathcal E_n)$, we will use the subscript $(\cdot)_n$ to denote the corresponding
objects. We denote by $\mathbf P(E_n)$ the set of additive set functions from $\mathcal E_n$ into
$[0,1]$ equipped with the $\tau_1$-topology of setwise convergence. We also equip
the Cartesian product $\mathbf P^n(E)=\prod_{p=0}^n\mathbf P(E_p)$ with the product topology.
We notice that the product τ-topology on $\mathcal P^n(E)$ coincides with the relative
topology induced on $\mathcal P^n(E)$ by the product $\tau_1$-topology on $\mathbf P^n(E)$. It is also convenient
to recall that $\mathbf P(E_n)$ is homeomorphic to a subset of the algebraic dual
$B_b(E_n)^\star$ of $B_b(E_n)$ equipped with the $B_b(E_n)$-topology (see for instance
Theorem C3 on p. 315 in [112]).

Without further mention, we shall assume that 1) $\mathbf P(E_n)$ is furnished
with a σ-algebra that contains the Borel σ-field associated with the
$\tau_1$-topology on $\mathbf P(E_n)$ and 2) the mappings $\Phi_n$ are continuous from $\mathbf P(E_{n-1})$
into $\mathbf P(E_n)$ with $\Phi_n(\mathcal P(E_{n-1}))\subset\mathcal P(E_n)$. From previous observations, it is
easy to check that this continuity condition is met for Feynman-Kac models
as soon as the potential functions are strictly positive. 3) The regularity
condition (H) stated on page 338 is met.







Our immediate objective is to provide a projective limit interpretation
of $\mathbf P^n(E)$. This program is achieved hereafter using topological arguments
similar to the ones we used in Section 10.6. We let $\mathcal U_n$ be the set of all finite
and Borel partitions $U_n$ of $E_n$, and we associate with each $U_n\in\mathcal U_n$ the
σ-algebra $\sigma(U_n)$ generated by the partition $U_n$. We recall that a partition $U_n$
is said to be finer than another partition $V_n\in\mathcal U_n$, and we write $U_n\ge V_n$ as
soon as $\sigma(U_n)\supset\sigma(V_n)$. We also equip the Cartesian product $\mathcal U^n=\prod_{p=0}^n\mathcal U_p$
with the partial ordering defined by

$$(U_p)_{0\le p\le n}\le(V_p)_{0\le p\le n}\ \Longleftrightarrow\ \forall\,0\le p\le n\quad U_p\le V_p$$

As we did for the directed set $(\mathcal U_n,\ge)$, it is not difficult to prove that $(\mathcal U^n,\ge)$
is again a directed set. We slightly abuse the notation and for any $U_n\in\mathcal U_n$
we denote by $B_b(U_n)$ the set of bounded and $\sigma(U_n)$-measurable functions
on $E_n$ and by $\mathcal P(U_n)$ the set of probability measures on $(E_n,\sigma(U_n))$. We
equip the set $\mathcal P(U_n)$, $U_n=(U_n^i)_{1\le i\le d_n}\in\mathcal U_n$, with the Kolmogorov-Smirnov
metric $d_{U_n}$ (see Definition 10.6.1).

Whenever $U_n\ge V_n$, we recall that we have $\mathcal P(U_n)=q_{U_n}(\mathbf P(E_n))$ and
$\mathcal P(V_n)=p_{U_n,V_n}(\mathcal P(U_n))$ with the corresponding continuous projection
operators $q_{U_n}$ and $p_{U_n,V_n}$.

By Proposition 10.6.1, we know that $\mathcal P_n=((\mathcal P(U_n),d_{U_n}),p_{U_n,V_n})_{U_n\ge V_n}$
is a projective inverse spectrum of $\mathcal U_n$. In addition, the mapping $h_n$ that
associates with a point $\mu_n=(\mu_n^{U_n})_{U_n\in\mathcal U_n}\in\varprojlim_{\mathcal U_n}\mathcal P_n$ the set function
$h_n(\mu_n)\in\mathbf P(E_n)$ defined for any $A_n\in\mathcal E_n$ by

$$h_n(\mu_n)(A_n)=\mu_n^{U_n}(A_n)\qquad(10.49)$$

for some $U_n\in\mathcal U_n$ with $A_n\in\sigma(U_n)$ is a homeomorphism between the
compact spaces $\varprojlim_{\mathcal U_n}\mathcal P_n$ and $\mathbf P(E_n)$. The projective limit interpretation
of the product $\tau_1$-topology on $\mathbf P^n(E)$ is notationally more consuming but
can be conducted in a similar fashion.

We associate with each $U^n=(U_p)_{0\le p\le n}\in\mathcal U^n$ the Cartesian product

$$\mathcal P^n(U)=\prod_{p=0}^n\mathcal P(U_p)$$

equipped with the product topology inherited by the metric spaces $\mathcal P(U_p)$.
For any $U^n\ge V^n$, we denote respectively by $q_{V^n}$ and $p_{U^n,V^n}$ the continuous
and canonical projections from $\mathbf P^n(E)$, and respectively from $\mathcal P^n(U)$, into
$\mathcal P^n(V)$ and defined for any $\mu=(\mu_p)_{0\le p\le n}\in\mathbf P^n(E)$ and $\nu=(\nu_p)_{0\le p\le n}\in\mathcal P^n(U)$
by the formulae

$$q_{V^n}(\mu)=(q_{V_p}(\mu_p))_{0\le p\le n}\quad\text{and}\quad p_{U^n,V^n}(\nu)=(p_{U_p,V_p}(\nu_p))_{0\le p\le n}$$

By construction, the set $\mathcal P^n=(\mathcal P^n(U),p_{U^n,V^n})_{U^n\ge V^n}$ forms a projective
inverse spectrum of $\mathcal U^n$ with compact metric spaces $\mathcal P^n(U)$ and connecting







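maps $p_{U^n,V^n}$. In addition, it is not difficult to check that the change of
coordinate mappings

$$\pi_n\ :\ \mu=(\mu_0^{U_0},\dots,\mu_n^{U_n})_{U_0,\dots,U_n}\ \longrightarrow\ \pi_n(\mu)=(\pi^0(\mu),\dots,\pi^n(\mu))\qquad(10.50)$$

is a homeomorphism between $\varprojlim_{\mathcal U^n}\mathcal P^n$ and $\prod_{p=0}^n\varprojlim_{\mathcal U_p}\mathcal P_p$. Recalling that
$\prod_{p=0}^n\varprojlim_{\mathcal U_p}\mathcal P_p$ is homeomorphic to $\prod_{p=0}^n\mathbf P(E_p)$, we now have the topological machinery
we need to easily prove the following.

Proposition 10.8.1 For any $n\ge0$ the mapping

$$h^n\ :\ \mu\in\varprojlim_{\mathcal U^n}\mathcal P^n\ \longrightarrow\ h^n(\mu)=(h_0(\pi^0(\mu)),\dots,h_n(\pi^n(\mu)))\in\mathbf P^n(E)$$

defined in terms of the collection of maps $(h_p,\pi^p)$, $p\le n$, introduced in
(10.49) and (10.50) is a homeomorphism.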

Having established that $\mathbf P^n(E)$ equipped with the product $\tau_1$-topology is
homeomorphic to the compact space $\varprojlim_{\mathcal U^n}\mathcal P^n$, and before getting into more
serious business on large deviations, we now define and analyze more
precisely the candidate rate function presented in (10.48) in Corollary 10.8.1.

Definition 10.8.2 For any $n\ge0$, we denote by $J_n$ the l.s.c. function
defined by

$$(\mu_p)_{p\le n}\in\mathbf P^n(E)\ \longrightarrow\ J_n((\mu_p)_{p\le n})=\sum_{p=0}^nH_p(\mu_p\,|\,\Phi_p(\mu_{p-1}))$$

with the convention $\Phi_0(\mu_{-1})=\eta_0$ for $p=0$ and the relative entropy criterion
$H_p$ on $(\mathbf P(E_p)\times\mathbf P(E_p))$ defined by

$$H_p(\cdot|\cdot)\ :\ (\mu,\nu)\in\mathbf P(E_p)\times\mathbf P(E_p)\ \longrightarrow\ H_p(\mu|\nu)=\sup_{U_p\in\mathcal U_p}\mathrm{Ent}_{U_p}(\mu|\nu)\in[0,\infty]$$

Definition 10.8.3 We associate with each $U^n\in\mathcal U^n$ the image function

$$J_n^U=J_n\circ q_{U^n}^{-1}\ :\ \mathcal P^n(U)\to[0,\infty]$$

of $J_n$ with respect to $q_{U^n}$ defined for any $\mu\in\mathcal P^n(U)$ by the formula

$$J_n^U(\mu)=\inf\{J_n(\nu)\ :\ \nu\in\mathbf P^n(E)\ \text{such that}\ q_{U^n}(\nu)=\mu\}$$

The l.s.c. property of $J_n$ comes from the fact that it can be obtained as
the supremum of τ-continuous functions

$$J_n(\mu)=\sup_{U^n\in\mathcal U^n}\ \sup_{f\in B_b^n(U)}\ \sum_{p=0}^n\big[\mu_p(f_p)-\log\Phi_p(\mu_{p-1})(e^{f_p})\big]$$

where we have used the notation $B_b^n(U)=\prod_{p=0}^nB_b(U_p)$ and the traditional
convention $\Phi_p(\mu_{p-1})=\eta_0$ for $p=0$.
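For probability measures on a finite state space the rate function reduces to a finite sum of relative entropies and can be evaluated directly. The sketch below is ours and assumes a time-homogeneous pair `G`, `M` with illustrative values. It shows that $J_n$ vanishes along the deterministic flow $\eta_p=\Phi(\eta_{p-1})$ and becomes positive as soon as one marginal is perturbed.

```python
import math


def rel_entropy(mu, nu):
    """Ent(mu | nu) on a finite set, with 0 log 0 = 0 (nu has full support here)."""
    return sum(m * math.log(m / n) for m, n in zip(mu, nu) if m > 0.0)


def phi(eta, G, M):
    """Phi(eta)(y) = sum_x eta(x) G(x) M[x][y] / sum_x eta(x) G(x)."""
    norm = sum(e * g for e, g in zip(eta, G))
    return [
        sum(e * g * row[y] for e, g, row in zip(eta, G, M)) / norm
        for y in range(len(M[0]))
    ]


def rate(path, eta0, G, M):
    """J_n(mu) = sum_p Ent(mu_p | Phi(mu_{p-1})), with the convention Phi(mu_{-1}) = eta0."""
    total, reference = 0.0, eta0
    for mu_p in path:
        total += rel_entropy(mu_p, reference)
        reference = phi(mu_p, G, M)
    return total


eta0 = [0.2, 0.5, 0.3]
G = [1.0, 0.5, 2.0]
M = [[0.8, 0.1, 0.1], [0.2, 0.6, 0.2], [0.1, 0.3, 0.6]]

flow = [eta0]
for _ in range(3):
    flow.append(phi(flow[-1], G, M))

perturbed = [list(m) for m in flow]
perturbed[2] = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]  # move one marginal off the flow

rate_on_flow = rate(flow, eta0, G, M)
rate_perturbed = rate(perturbed, eta0, G, M)
```

The value `rate_on_flow` is zero while `rate_perturbed` is strictly positive, which reflects the fact that the limiting flow is the unique zero of the rate function in this example.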







Lemma 10.8.1 For any $n\ge0$, the domain $D_{J_n}=\{J_n<\infty\}$ of $J_n$ is
included in $\mathcal P^n(E)$, and for any $\mu=(\mu_p)_{p\le n}\in\mathcal P^n(E)$ and $U^n\in\mathcal U^n$ we
have

$$J_n(\mu)=\sum_{p=0}^n\mathrm{Ent}(\mu_p\,|\,\Phi_p(\mu_{p-1}))\qquad(10.51)$$

and

$$J_n^U(q_{U^n}(\mu))=\sum_{p=0}^n\ \inf_{\nu\in q_{U_{p-1}}^{-1}(\{q_{U_{p-1}}(\mu_{p-1})\})}\mathrm{Ent}_{U_p}(q_{U_p}(\mu_p)\,|\,\Phi_p(\nu))\qquad(10.52)$$

Proof:

We prove the lemma by induction on the time parameter. For $n=0$, the
result is a direct consequence of Lemma 10.6.2. Suppose we have proved
the lemma at rank $(n-1)$. Since for any $\mu=(\mu_p)_{p\le n}$ we have

$$J_n(\mu)=J_{n-1}((\mu_p)_{p<n})+H_n(\mu_n\,|\,\Phi_n(\mu_{n-1}))$$

by the induction hypothesis we find that

$$J_n(\mu)<\infty\ \Longrightarrow\ (\mu_p)_{p<n}\in\mathcal P^{n-1}(E)\ \text{ and }\ H_n(\mu_n\,|\,\Phi_n(\mu_{n-1}))<\infty$$

Invoking again Lemma 10.6.2, we conclude that

$$J_n(\mu)<\infty\ \Longrightarrow\ (\mu_p)_{p<n}\in\mathcal P^{n-1}(E)\ \text{ and }\ \mu_n\in\mathcal P(E_n)$$

The proof of (10.51) is now a straightforward consequence of (10.40), page 368.
We prove (10.52) using the same line of argument as the one used on
page 371. This ends the proof of the lemma. ■

The end of this section is essentially devoted to the proof of Theorem 10.1.2. 
Before getting into it, it is convenient to make some remarks. 

By Lemma 10.8.1, we first observe that the LDP for the $\tau_1$-topology
on $\mathbf P^n(E)$ implies the LDP for the τ-topology on $\mathcal P^n(E)$ and hence
for the weak topology, always with the same rate function. As we noted
earlier (see p. 341), in the case of Feynman-Kac models on Polish spaces,
the exponential tightness (for the weak topology) of the laws of $(\eta_p^N)_{p\le n}$
(proved at the end of Section 10.3) combined with the LDP lower bound (for
the weak topology) shows that the rate function $J_n$ has compact level sets
(for the weak topology). This shows that for Feynman-Kac type particle
models the LDP for the $\tau_1$-topology on $\mathbf P^n(E)$ with (good) rate function $J_n$
implies the LDP for the weak topology on $\mathcal P^n(E)$ with good rate function
$J_n$.

After this brief digression, we now come to the following proof. 







Proof of Theorem 10.1.2: 

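To prepare the proof of the LDP upper bound, we first recall that any
closed $F$ of $\mathbf P^n(E)$ has the form

$$F=\cap_{U^n\in\mathcal U^n}\ q_{U^n}^{-1}(F_{U^n})$$

where $F_{U^n}$ stands for the $\tau_1$-closure of the set $q_{U^n}(F)$. Since $\mathbf P^n(E)$ is
compact, the sets $q_{U^n}^{-1}(F_{U^n})$ are compact and for any $\varepsilon>0$ one can find a
finite and open $\varepsilon$-covering

$$q_{U^n}^{-1}(F_{U^n})\subset\cup_{i=1}^m\,V^{U^n}(\mu^i,\varepsilon)$$

with $\mu^i=(\mu_p^i)_{p\le n}\in F$ and $V^{U^n}(\mu^i,\varepsilon)=\prod_{0\le p\le n}V^{U_p}(\mu_p^i,\varepsilon)$.
Under our continuity assumptions, we can choose these open
neighborhoods such that for any $\eta_p\in V^{U_p}(\mu_p^i,\varepsilon)$

$$d_{U_p}(\eta_p,\mu_p^i)\vee d_{U_{p+1}}(\Phi_{p+1}(\eta_p),\Phi_{p+1}(\mu_p^i))\le\varepsilon$$

In this case, we have for each $(f_p)_{p\le n}\in B_b^n(U)$

$$P((\eta_p^N)_{p\le n}\in V^{U^n}(\mu^i,\varepsilon))\le e^{\varepsilon N c(f_n)-N(\mu_n^i(f_n)-\log\Phi_n(\mu_{n-1}^i)(e^{f_n}))}\ P((\eta_p^N)_{p<n}\in V^{U^{n-1}}(\mu^i,\varepsilon))$$

for some finite constant $c(f_n)<\infty$ whose values only depend on the
supremum norm of $f_n$. A simple induction now yields that

$$P((\eta_p^N)_{p\le n}\in V^{U^n}(\mu^i,\varepsilon))\le\exp\Big(\varepsilon N\sum_{p\le n}c(f_p)-N\sum_{p\le n}\big(\mu_p^i(f_p)-\log\Phi_p(\mu_{p-1}^i)(e^{f_p})\big)\Big)$$

from which we conclude that

$$\limsup_{\varepsilon\to0}\ \limsup_{N\to\infty}\ \frac1N\log P((\eta_p^N)_{p\le n}\in V^{U^n}(\mu^i,\varepsilon))\le-\sum_{p\le n}\big(\mu_p^i(f_p)-\log\Phi_p(\mu_{p-1}^i)(e^{f_p})\big)$$

Taking the infimum over all $(f_p)_{p\le n}\in B_b^n(U)$ and by (10.52), we conclude
that

$$\limsup_{\varepsilon\to0}\ \limsup_{N\to\infty}\ \frac1N\log P((\eta_p^N)_{p\le n}\in V^{U^n}(\mu^i,\varepsilon))$$
$$\le-\sum_{p\le n}\mathrm{Ent}_{U_p}(\mu_p^i\,|\,\Phi_p(\mu_{p-1}^i))\le-J_n^U(q_{U^n}(\mu^i))\le-J_n^U(F_{U^n})$$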







By the union of event bounds, we find that

$$\limsup_{N\to\infty}\ \frac1N\log P((\eta_p^N)_{p\le n}\in q_{U^n}^{-1}(F_{U^n}))\le-J_n^U(F_{U^n})$$

and therefore

$$\limsup_{N\to\infty}\ \frac1N\log P((\eta_p^N)_{p\le n}\in F)\le-J_n^U(F_{U^n})$$

We end the proof of the LDP upper bound by taking the infimum over
all $U^n$, invoking the min-max theorem (Theorem 10.5.2) and recalling that
$J_n(F\cap\mathcal P^n(E))=J_n(F)$ (see Lemma 10.8.1).

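Our final objective is to prove that the rate function $J_n$ governs the
LDP lower bounds on $\mathbf P^n(E)$. We use an inductive proof with respect to
the time parameter. Note that for $n=0$ the desired LDP results from
Sanov’s theorem (Theorem 10.6.1). Suppose the desired LDP is proved at
time $(n-1)$. Let $A\subset\mathbf P^n(E)$ be a $\tau_1$-open set such that $J_n(A)<\infty$
(otherwise the proof of the lower bound is as usual trivial). Invoking
Lemma 10.8.1, for any $\delta>0$ there exists a point $\mu\in A\cap\mathcal P^n(E)$ such that

$$J_n(\mu)\le J_n(A)+\delta\qquad(10.53)$$

Since $A$ is open, we can find a collection of strictly positive numbers
$(\varepsilon_p)_{0\le p\le n}$ and a sequence of finite partitions $U_p=(U_p^i)_{1\le i\le d_p}$ of the sets
$E_p$ such that

$$C_n(\mu)=\prod_{0\le p\le n}B_{U_p}(\mu_p,\varepsilon_p)\subset A$$

with the open neighborhood $B_{U_p}(\mu_p,\varepsilon_p)$ of $\mu_p\in\mathcal P(E_p)$ given by

$$B_{U_p}(\mu_p,\varepsilon_p)=\{\nu_p\in\mathbf P(E_p)\ :\ d_{U_p}(\mu_p,\nu_p)<\varepsilon_p\}$$

Up to a change of index we suppose that $\mu_n(U_n^{d_n})=\vee_{i=1}^{d_n}\mu_n(U_n^i)\,(>0)$, and
we associate with $\mu_n\in\mathcal P(E_n)$ the $N$-approximation distributions $\mu_n^N\in\mathcal P(U_n)$
defined by

$$\mu_n^N(U_n^i)=\begin{cases}[N\mu_n(U_n^i)]/N & \text{if } 1\le i<d_n\\[2pt] 1-\sum_{j=1}^{d_n-1}[N\mu_n(U_n^j)]/N & \text{if } i=d_n\end{cases}$$

By construction we note that

$$\mu_n^N\ll q_{U_n}(\mu_n)\ll q_{U_n}(\Phi_n(\mu_{n-1}))\quad\text{and}\quad\mathrm{Ent}_{U_n}(\mu_n^N\,|\,\Phi_n(\mu_{n-1}))<\infty$$

Under our assumptions, we also have for any $\nu_{n-1}\in B_{U_{n-1}}(\mu_{n-1},\varepsilon_{n-1})$
and $1\le i\le d_n$

$$\mu_n^N(U_n^i)>0\ \Longrightarrow\ \Phi_n(\nu_{n-1})(U_n^i)>0$$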







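Since $d_{U_n}(\mu_n^N,\mu_n)\le(d_n-1)/N$, we find that $\mu_n^N\in B_{U_n}(\mu_n,\varepsilon_n)$ as soon
as $N>N_n^1=(d_n-1)/\varepsilon_n$ and hence

$$C_{n-1}(\mu)\times\{\nu_n\in\mathcal P(E_n)\ :\ d_{U_n}(\mu_n^N,\nu_n)=0\}\subset C_n(\mu)\subset A$$

By the definition of the $N$-particle model and by Lemma 10.6.3, it is also
clear that

$$P\big(d_{U_n}(\mu_n^N,\eta_n^N)=0\,\big|\,\eta_{n-1}^N\big)\ge\exp\big[-N\,\mathrm{Ent}_{U_n}(\mu_n^N\,|\,\Phi_n(\eta_{n-1}^N))-d_n\log(N+1)\big]$$

Under our assumptions, we also have for any $\nu_{n-1}\in\mathcal P(E_{n-1})$

$$\rho_n\,q_{U_n}(\lambda_n)\le q_{U_n}(\Phi_n(\nu_{n-1}))\ll q_{U_n}(\lambda_n)$$

and from previous observations

$$\mu_n^N\ll q_{U_n}(\Phi_n(\mu_{n-1}))\ll q_{U_n}(\lambda_n)$$

By Lemma 10.6.1, we obtain the uniform Lipschitz estimate

$$|\mathrm{Ent}_{U_n}(\mu_n^N\,|\,\Phi_n(\nu_{n-1}))-\mathrm{Ent}_{U_n}(\mu_n^N\,|\,\Phi_n(\mu_{n-1}))|\le(\rho_n\,\lambda_{n,\star}(U_n))^{-1}\,d_{U_n}(\Phi_n(\nu_{n-1}),\Phi_n(\mu_{n-1}))$$

with the positive constant $\lambda_{n,\star}(U_n)=\wedge_{i\,:\,\lambda_n(U_n^i)>0}\,\lambda_n(U_n^i)>0$. Since the
mapping $\Phi_n$ is $\tau_1$-continuous, for every $\delta>0$ there exists some $\tau_1$-open
neighborhood $O_{\delta,n}(\mu_{n-1})\subset\mathbf P(E_{n-1})$ of $\mu_{n-1}$ (which depends on $U_n$) such
that

$$O_{\delta,n}(\mu_{n-1})\subset B_{U_{n-1}}(\mu_{n-1},\varepsilon_{n-1})$$

and on the set of events $\{\eta_{n-1}^N\in O_{\delta,n}(\mu_{n-1})\}$ we have

$$\mathrm{Ent}_{U_n}(\mu_n^N\,|\,\Phi_n(\eta_{n-1}^N))\le\mathrm{Ent}_{U_n}(\mu_n^N\,|\,\Phi_n(\mu_{n-1}))+\delta\ (<\infty)$$

The continuity of the function $\mathrm{Ent}_{U_n}(\cdot\,|\,\Phi_n(\mu_{n-1}))$ on the set of measures
$\{\nu_n\in\mathcal P(U_n)\ :\ \nu_n\ll q_{U_n}(\Phi_n(\mu_{n-1}))\}$ now implies that

$$\mathrm{Ent}_{U_n}(\mu_n^N\,|\,\Phi_n(\mu_{n-1}))\le\mathrm{Ent}_{U_n}(\mu_n\,|\,\Phi_n(\mu_{n-1}))+\delta\le\mathrm{Ent}(\mu_n\,|\,\Phi_n(\mu_{n-1}))+\delta$$

for any $N\ge N_n^2$ and some $N_n^2\ge1$. It is now convenient to observe that

$$Q_n^N(A)=P((\eta_0^N,\dots,\eta_n^N)\in A)$$
$$\ge P\big(d_{U_n}(\mu_n^N,\eta_n^N)=0,\ \eta_{n-1}^N\in O_{\delta,n}(\mu_{n-1})\ \text{and}\ (\eta_p^N)_{p\le n-2}\in C_{n-2}(\mu)\big)$$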







Using a simple calculation, we deduce the lower bound 

£ )!’(%. ((‘n.O = 0 I = k) OjLi (^(M.nW) 

with the Ti-open neighborhood 'D^s,e),n{f*) ^ of (Mp)p<n-i defined 

by 

From previous estimations, we find that for any N > {N^ A JV*) 

> - I ^n(Mn-l)) ~ - 2S + \ogQi!_, {V^s,e)M) 

= -Jnifi) - _ 25 + J„_i((/ip)p<„) + i logQ^i (%0.n(M)) 

Furthermore, by (10.53) and our induction hypothesis, we have 
liminfAr_>oo j/ \ogQ^ (^4) 

> -UA) - 3J + [J„-l((/x)p<n) - J„-i(%e),n(M))l > ~ ZS 



for any $\delta > 0$. Letting $\delta$ tend to 0, we conclude that $J_n$ governs the LDP
lower bound on $\mathcal{P}^n(E)$. This ends the proof of Theorem 10.1.2. ■




11 

Feynman-Kac and Interacting Particle 
Recipes 



11.1 Introduction 

This chapter offers a series of Feynman-Kac and interacting particle mod- 
eling recipes that can be combined with one another and applied to every 
application discussed in this book. We have chosen to present this catalog 
of Feynman-Kac techniques for several reasons. 

First of all, as soon as a particular estimation problem has been iden- 
tified as that of solving a Feynman-Kac formula of the form (1.3), the 
particle-approximation models are dictated by the pair potentials/kernels
$(G_n, M_n)$. In this sense, this section provides a unifying description of a
class of seemingly different particle algorithms currently used in applied 
literature. 

The second reason is that some of these techniques, such as the branch- 
ing strategies, the importance sampling changes of measures, the changes 
of potential functions, and the multilevel decompositions of the state, can 
be used in practice to increase the efficiency of Monte Carlo particle
algorithms.

Finally, the complete mathematical analysis and the precise numerical 
comparisons between these particle techniques have only been started, and 
various open problems remain to be solved. The forthcoming descriptions,
together with the mathematical asymptotic analysis described in the final
chapters of this book, not only suggest several avenues of research but, we
hope, will also encourage the reader to design new ideas in the future.







This chapter is organized as follows. In Section 11.2, we briefly review the 
interacting Metropolis model introduced in Section 5.4. We illustrate the 
impact of this particle MCMC algorithm in the context of the popular Ising 
ferromagnetic spin model. For applications to Bayesian spectral analysis, 
we refer the reader to [71]. 

In the next two sections, 11.3 and 11.4, we provide a brief discussion on 
Feynman-Kac path measures and their interacting particle and genealogical
tree interpretations. 

The next three sections, 11.5, 11.6, and 11.7, present some essential and 
additional tools such as conditional exploration techniques and excursion- 
valued particle models. Some of these techniques are not new. For instance, 
the branching excursion models presented in Section 11.7 offer very accu- 
rate stochastic and adaptive grid approximations. These ideas were applied 
originally as a natural heuristic scheme in tracking and global positioning 
systems in [43, 105, 103, 106, 107]. They were also analyzed rigorously by 
the author in the article [76] published in 1998. More recently, the same 
branching excursion models have been applied with success to generate
self-avoiding random walks by Frauenkron, Causo, and Grassberger in [149] and
to related protein-folding problems by Liu and Zhang [235] under the
botanical names “Markovian anticipation”, “lookahead strategies” and “sampling
importance sampling pilot exploration resampling”.

In Section 11.8, we design a new class of branching particle interpretation
models. Our reasons for including this material are twofold. First, we
believe that readers may benefit from a friendly presentation of some
easy simulation algorithms that can be directly implemented on a personal
computer. The second reason is that the mathematical analysis of some
of the branching models presented hereafter differs from the one discussed
in this book. Thus, this presentation also suggests new research avenues.
In this connection, we mention that the literature on genetic algorithms
abounds with various classes of selection generation strategies (see for
instance [22, 310] and references therein). More recently, in the particle
filter literature, there has been interest in finding the “most efficient”
selection procedure. The branching rules are often presented as intuitive
but heuristic schemes with no precise asymptotic results.

In Section 11.8.2 we provide a pair of practical and easy to use local
$L_2$-conditions on the local branching rule that ensure convergence to the
desired Feynman-Kac model. In Section 11.8.3, we illustrate these conditions
with Poisson, Bernoulli and other branching mechanisms such as the
Baker remainder stochastic sampling. The latter procedure was introduced
by J. Baker in [21, 22] in 1985 to reduce the computational efforts of genetic
type models. It was recently rediscovered and applied with success by Liu
and Chen in [228] in 1998. The first well founded asymptotic analysis and
related continuous time branching schemes with random population sizes
can also be found in [67, 96, 97].







In Section 11.8.4, we derive a new uniform convergence theorem which is
valid for any branching model with fixed population size. In Section 11.8.5, 
we also show that there is apparently no hope to obtain a uniform estimate 
for branching models with random population size. We hope that these 
rather elementary results will help the reader in developing new and more 
sophisticated asymptotic results such as propagation-of-chaos properties, 
fluctuations theorems, large-deviation principles, and related concentration 
results. 



11.2 Interacting Metropolis Models 



11.2.1 Introduction 



One strategy to generate approximate samples according to a given
distribution $\pi$ is to interpret the target distribution $\pi$ as the limiting
distribution of a judicious Feynman-Kac distribution flow. This modeling
technique is presented in full detail in Section 5.4. In the present section,
we content ourselves with briefly reviewing this Metropolis-Feynman-Kac model.
We also illustrate some consequences of the results developed in earlier
sections in the context of the Ising model. In this physical ferromagnetic spin
model, atoms sit at the nodes (sites) $s$ of the $d$-dimensional cubic lattice
$S = [-M, M]^d$. At each site $s$, the atom has a positive or negative spin
$y(s) \in \{+1, -1\}$, and each pair of atoms $y(s)$ and $y(s')$ has an interaction
energy $J_{s,s'}\, y(s)\, y(s')$ that, in addition to the interaction strength
parameter $J_{s,s'}$, depends on the distance between the pair of sites $(s,s')$.
The potential or the energy function of a configuration $y$ is often defined
by the formula

aeSs'eVs a€S 

where for any $s \in S$, $V_s = \{s' \in S : \vee_{p \le d}\, |s_p - s'_p| \le 1\}$. One typical
question arising in practice is not only to find the extremal energy
configurations but also to generate random samples with the Boltzmann-Gibbs
distributions on $E = \{-1, +1\}^S$ defined by

$$\pi(dy) = \frac{1}{\nu(e^{-H})}\, e^{-H(y)}\, \nu(dy) \qquad (11.1)$$



with the uniform distribution $\nu$. In this situation, we see that the interaction
function between atoms is not related to some temporal interpretation of
the state but to a particular neighborhood structure. On the other hand,
the uniform distribution $\nu$ can also be regarded as the limiting probability
of some Markov exploration chain on the space of all configurations. We
can choose for instance

$$K(y, y') = \frac{1}{|V(y)|}\, 1_{V(y)}(y') \qquad (11.2)$$







where $|V(y)|$ stands for the cardinality of the neighborhood subset of $y$ given
by $V(y) = \{y' \in E : \mathrm{Card}\{s \in S : y'(s) \neq y(s)\} \le 1\}$. By symmetry arguments,
we readily find that $K$ is $\nu$-reversible; that is, $\nu(y')K(y',y) = \nu(y)K(y,y')$.
We mention that these Boltzmann-Gibbs distributions are often used as
“prior models” in Bayesian image analysis [313] and pattern theory [168].
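
As a small illustration of the exploration kernel (11.2), the following Python sketch enumerates the neighborhood $V(y)$ on a tiny lattice and checks $\nu$-reversibility numerically. The encoding of a configuration as a flat tuple of spins, the lattice size, and the helper names `neighborhood`, `K` and `sample_K` are assumptions made for this example only; they do not come from the text.

```python
import itertools
import random

def neighborhood(y):
    """Return V(y): y itself plus every configuration differing from y at one site."""
    flips = [y[:s] + (-y[s],) + y[s + 1:] for s in range(len(y))]
    return [y] + flips

def K(y, y_prime):
    """Transition probability K(y, y') = 1_{V(y)}(y') / |V(y)|."""
    V = neighborhood(y)
    return (1.0 / len(V)) if y_prime in V else 0.0

def sample_K(y, rng=random):
    """Draw one exploration move from K(y, .)."""
    return rng.choice(neighborhood(y))

# Tiny lattice with 4 sites, flattened into a tuple of spins (an assumed encoding).
n_sites = 4
E = list(itertools.product((-1, +1), repeat=n_sites))
nu = 1.0 / len(E)  # uniform reference distribution on E

# nu-reversibility: nu(y) K(y, y') == nu(y') K(y', y) for all pairs (y, y').
reversible = all(
    abs(nu * K(y, yp) - nu * K(yp, y)) < 1e-12 for y in E for yp in E
)
```

On this toy lattice every neighborhood has `n_sites + 1` elements, which is the count $|V(y)| = 1 + |S|$ used later in the mixing estimate.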



11.2.2 Feynman-Kac-Metropolis and Particle Models 

We start with a pair of Markov kernels $(K, L)$ such that

$$(\pi \times L)_2 \ll (\pi \times K)_1 \qquad (11.3)$$

with the distributions on the transition space $(E \times E)$ defined by

$$(\pi \times L)_2(d(y,y')) = \pi(dy')\, L(y', dy)$$

$$(\pi \times K)_1(d(y,y')) = \pi(dy)\, K(y, dy')$$

Then we associate with the pair $(K, L)$ the Metropolis potential ratio on
$(E \times E)$ given by

$$G = \frac{d(\pi \times L)_2}{d(\pi \times K)_1}$$

In the case of the Ising model (11.1) with $K = L$ given by (11.2), we
find that for any $y' \in V(y)$

$G(y,y') = \exp\{-(H(y') - H(y))\}$

= exp I (y(s) - y'(s)) J,,s'y{s') 

I V »'6V. 




where s stands for the only possible site where y' and y may differ. The main 
interest in introducing the Metropolis potential ratio above comes from the 
following key Feynman-Kac formula (see Theorem 5.5.1, Section 5.5) 



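$$E_\pi^L\big(f_n(Y_n,\dots,Y_0) \mid Y_n = y\big) = \frac{E_y^K\big(f_n(Y_0,\dots,Y_n) \prod_{p=0}^{n-1} G(Y_p,Y_{p+1})\big)}{E_y^K\big(\prod_{p=0}^{n-1} G(Y_p,Y_{p+1})\big)} \qquad (11.4)$$
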

We recall that $E_\pi^L$ stands for the expectation operator with respect to the
law of a (canonical) Markov chain with initial distribution $\pi$ and elementary
transitions $L$. In the same way, $E_y^K$ is the expectation operator with
respect to the law of a (canonical) Markov chain starting at $y$ and with
elementary transitions $K$. Then we observe that under $P_y^K$ the random
sequence of transitions defined by the pairs $X_n = (Y_n, Y_{n+1})$ forms a Markov
chain with initial distribution $\eta_0 = (\delta_y \times K)_1$, and its elementary transitions
are given by

$$M^K((y,y'), d(z,z')) = \delta_{y'}(dz)\, K(z, dz')$$

In this notation, we readily find the formula

$$E_\pi^L\big(\varphi(Y_1,Y_0) \mid Y_{n+1} = y\big) = \frac{E_y^K\big(\varphi(Y_n,Y_{n+1}) \prod_{p=0}^{n} G(Y_p,Y_{p+1})\big)}{E_y^K\big(\prod_{p=0}^{n} G(Y_p,Y_{p+1})\big)} = \frac{E_{\eta_0}^{M^K}\big(\varphi(X_n) \prod_{p=0}^{n} G(X_p)\big)}{E_{\eta_0}^{M^K}\big(\prod_{p=0}^{n} G(X_p)\big)} = \hat\eta_n(\varphi) \qquad (11.5)$$

where $\hat\eta_n$ represents the updated Feynman-Kac model associated with the
pair potential/kernel $(G, M^K)$.

Before getting further into our discussion, by (11.5) we observe that in
some sense and as $n \to \infty$ we have $\hat\eta_n \to (\pi \times L)_2$ as soon as the Markov
kernel $L$ is sufficiently mixing. For more details, we refer the reader to
Section 5.5.4. To illustrate this assertion, we return to the Ising model
example. In this situation, we note that any pair of configurations $y, y' \in E$
can be joined by a $K$-admissible path of length $m \le m(d) = (2M+1)^d = |S|$.
More precisely, there always exists a path $y_0, y_1, \dots, y_m \in E$ such
that $y_0 = y$, $y_i \in V(y_{i+1})$ for all $i < m$ and $y_m = y'$. Since we have
$|V(y)| = (1 + |S|)$ for each $y \in E$, this yields the rather crude lower bound



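$$\varepsilon(d) = \inf_{y,y',y''} \frac{dK^{m(d)}(y, \cdot)}{dK^{m(d)}(y', \cdot)}(y'') \ge (1 + |S|)^{-|S|}$$

We conclude that $\beta(K^{m(d)}) \le (1 - \varepsilon(d))$, where $\beta(K^{m(d)})$ represents the
Dobrushin ergodic coefficient of $K^{m(d)}$ (see for instance (4.12) on page 128).

Recalling that $K = L$, we conclude that the mixing condition $(L)_m$ introduced
on page 182 is met with $m = m(d)$ and $\varepsilon(L) = \varepsilon(d)$. Thus, by
Theorem 5.5.2, we conclude that

$$\|\hat\eta_{m(d)+n+1} - (\pi \times L)_2\|_{tv} \le 2\, \varepsilon(d)^{-1}\, (1 - \varepsilon(d))^{\lfloor n/m(d) \rfloor} \qquad (11.6)$$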

In Section 5.5, we have seen that the distribution flow $\hat\eta_n$ can be
interpreted as the evolution of the laws of a nonlinear (or nonhomogeneous)
Feynman-Kac-Metropolis model. The display above illustrates the main
property of this class of models, namely that their decay rates to the
desired target distribution do not depend on the nature of the limiting
distribution. The final step is now clear. The particle simulation algorithm
coincides with the particle interpretation of the Feynman-Kac models
described above. For the convenience of the reader, we briefly describe one
rather generic type of $E^N$-valued particle model:

$$(Y_n^i)_{1 \le i \le N} \xrightarrow{\ \text{exploration}\ } (Y_n'^{\,i})_{1 \le i \le N} \xrightarrow{\ \text{selection}\ } (Y_{n+1}^i)_{1 \le i \le N}$$



Initially, we start for instance with $N$ particles in the same location $Y_0^i = y$.
During the exploration stage, each particle $Y_n^i$ evolves to a new location
$Y_n'^{\,i}$ randomly chosen with the distribution $K(Y_n^i, \cdot)$. During the selection
stage, we sample randomly $N$ random variables $Y_{n+1}^i$ according to the
acceptance/rejection rules

, _/r« withproba 

withproba 

where $(\widetilde{Y}_n^i)_{1 \le i \le N}$ represents a sequence of conditionally independent
random variables with the discrete distribution
$\sum_{j=1}^N \frac{G(Y_n^j, Y_n'^{\,j})}{\sum_{k=1}^N G(Y_n^k, Y_n'^{\,k})}\, \delta_{Y_n'^{\,j}}$.

In practice, the initial choice $Y_0^i = y$ is of course far from being optimal if the
only objective is to sample according to the target distribution $\pi$.
Nevertheless, in this case the resulting genealogical tree occupation measure




i=l 



<YIJ 



is an N-particle approximation of the distribution 

$$\pi_{y,n} = P_\pi^L\big((Y_n, \dots, Y_0) \in \cdot \mid Y_n = y\big)$$

For more details, we again refer the reader to Section 5.5. Using simple 
manipulations, we deduce that for any x, x\ x" €{E x E) we have 



j(A//f)m(d)+l(x, .) 



(x")>e(d) and G(x) > G(x') 



In other words, the time homogeneous pair (G, M^) satisfies the regularity 
conditions (G) and (M)m introduced in Section 3.5.2 with m = m(d) + 1, 
e(M) > e(d) and c(G) > These elementary observations allows to 

apply most of the asymptotic results presented in Chapter 4 and Chapters 7
to 10 to analyze the asymptotic behavior of these Feynman-Kac models
and their particle interpretations as the time parameter or as the size of
the system tends to infinity.

For instance, in the case of the Ising model described above and as a
direct consequence of (11.6) and (7.26) in Theorem 7.4.4, we find that for
any $f \in \mathrm{Osc}_1(E)$



E 



i=l 



i/p 

< ^-|-2£(d)-‘ (l-£(d))l'*/"*(‘')J 

y/N 



with 

6(p) = c m{d) exp {(2m(d) + 1) osc{H)} 

Similarly, for any nondecreasing pair sequences {q{N),n{N)) such that 
n{N)q^{N) = o{N), we have Iqr Corollary 8.9.2 



limsup 

N -¥00 



^ upjNMN)) 

n{N)q^Ny v,n(N) 



< c e{d) 



-2 



g2m(d) 06C(H) 







where represents the distribution of the first 9 -path ancestral lines 
from the origin up to time n. 

11.2.3 Interacting Metropolis and Gibbs Samplers 

The objective of this section is to better connect the interacting Metropolis
model discussed above with the more traditional Gibbs sampler. More
precisely, we show that the $N$-particle model with “simple” Gibbs mutation
transitions coincides with $N$ independent copies of the Gibbs sampler. Let
$(E, \mathcal{E})$ be some measurable space and let $\pi \in \mathcal{P}(E^d)$ be the distribution
of a $d$-dimensional random vector $U = (U^i)_{1 \le i \le d}$. For each index $i$, we let
$U_i = (U^j)_{1 \le j \le d,\ j \ne i}$ be the $(d-1)$-dimensional vector deduced from $U$ by
simply deleting the $i$th coordinate and let $\pi_i$ be its marginal distribution.
We further require that the distribution $\pi$ can be disintegrated with respect
to the distribution $\pi_i$ and we have the formula

$$\pi(d(u^1, \dots, u^d)) = \pi_i(du_i)\, \pi^i(u_i, du^i)$$

for some Markov transition $\pi^i$ from $E^{d-1}$ into $E$, where $du_i$ stands for
an infinitesimal neighborhood of the point $u_i = (u^j)_{1 \le j \le d,\ j \ne i}$. We finally
introduce the Markov transition $k_i$ on $E^d$ defined by

$$k_i(f)(u) = \int_E \pi^i(u_i, dv^i)\, f(\theta_i(u, v^i))$$

for any $(f, u) \in (\mathcal{B}_b(E^d) \times E^d)$ and where $\theta_i(u, v^i) \in E^d$ is the $d$-vector
defined for any $1 \le k \le d$ by

$$\theta_i(u, v^i)^k = v^i\, 1_{k = i} + u^k\, 1_{k \ne i}$$

By construction, we clearly have that $(\pi \times k_i)_1 = (\pi \times k_i)_2$ for any $1 \le i \le d$;
in other words, each $k_i$ is reversible with respect to $\pi$. This only indicates
that each $k_i$ is a candidate to construct a Markov chain Monte Carlo
algorithm with limiting measure $\pi$. As the reader has certainly noticed, these
kernels are usually degenerate, and the resulting chain will behave very
poorly. The simple Gibbs sampler is a homogeneous Markov chain $Y_n$ with
an elementary transition of the form $(k_1 \cdots k_d)^m$ for some $m \ge 1$. For
$m = 1$, the random transition consists in changing successively each
coordinate of the initial state. For $m > 1$, we repeat this mechanism $m$ times.
To take the final step, we notice that $(\pi \times K)_1 = (\pi \times L)_2$ for any pair of
transitions $(K, L)$ of the form

$$K = (k_1 \cdots k_d)^m \quad \text{and} \quad L = (k_d k_{d-1} \cdots k_1)^m$$

for some $m \ge 1$. We now leave the reader to check that the $N$-particle
interacting Metropolis model associated with the pair $(K, L)$ coincides with $N$
independent samples of the Gibbs algorithm.
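
The identity $(\pi \times K)_1 = (\pi \times L)_2$ can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: a finite state space $E = \{0, 1, 2\}$, $d = 2$ coordinates, $m = 1$, and an arbitrary positive table `pi` drawn at random. The function name `gibbs_kernel` and the matrix encoding of the kernels are hypothetical choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
m_states = 3                                   # |E|, with d = 2 coordinates
pi = rng.random((m_states, m_states))
pi /= pi.sum()                                 # arbitrary positive target on E^2
states = [(a, b) for a in range(m_states) for b in range(m_states)]
pi_vec = np.array([pi[u] for u in states])

def gibbs_kernel(i):
    """Matrix of k_i: resample coordinate i from its conditional law under pi."""
    k = np.zeros((len(states), len(states)))
    for r, u in enumerate(states):
        for c, v in enumerate(states):
            if u[1 - i] != v[1 - i]:
                continue                       # the other coordinate must stay fixed
            if i == 0:
                k[r, c] = pi[v[0], u[1]] / pi[:, u[1]].sum()
            else:
                k[r, c] = pi[u[0], v[1]] / pi[u[0], :].sum()
    return k

k1, k2 = gibbs_kernel(0), gibbs_kernel(1)
K = k1 @ k2                                    # K = k_1 k_2  (m = 1)
L = k2 @ k1                                    # L = k_2 k_1

flow_K = pi_vec[:, None] * K                   # entry (u, v): pi(u) K(u, v)
flow_L = (pi_vec[:, None] * L).T               # entry (u, v): pi(v) L(v, u)
same_flow = np.allclose(flow_K, flow_L)        # (pi x K)_1 == (pi x L)_2
```

With these two measures equal, the Metropolis potential ratio $G$ is identically one, which is why the selection stage has no effect and the particles evolve independently.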







11.3 An Overview of some General Principles 

In earlier sections, we have developed a particle methodology for solving
numerically Feynman-Kac distributions of the form

$$\widehat{Q}_n(d(x_0, \dots, x_n)) = \frac{1}{\widehat{Z}_n} \Big\{ \prod_{q=0}^{n} G_q(x_q) \Big\}\, P_n(d(x_0, \dots, x_n)) \qquad (11.7)$$

where $G_n$ is a sequence of nonnegative potential functions on some
measurable spaces $(E_n, \mathcal{E}_n)$ and $P_n$ is the distribution defined by

$$P_n(d(x_0, \dots, x_n)) = \eta_0(dx_0)\, M_1(x_0, dx_1) \cdots M_n(x_{n-1}, dx_n)$$

where $M_n$ is a sequence of Markov transitions from $E_{n-1}$ into $E_n$ and $\eta_0$
some initial distribution on $E_0$. The particle interpretation of these
measures is not unique. It can be defined alternatively from the dynamical
equation of the $n$th time marginal $\hat\eta_n$ or from that of the prediction flow
$\eta_n$ given for any $f_n \in \mathcal{B}_b(E_n)$ by the equation

$$\eta_n(f_n) = \gamma_n(f_n)/\gamma_n(1) \quad \text{with} \quad \gamma_n(f_n) = E\Big( f_n(X_n) \prod_{p=0}^{n-1} G_p(X_p) \Big)$$

In Section 2.4.3, we have seen that the updated model can also be
interpreted as a prediction model associated with the pair of updated
potentials/transitions $(\hat{G}_n, \hat{M}_n)$ defined in (2.13) and (2.14). In this sense,
the analysis of the prediction model provides a unifying treatment of both
situations. Its evolution equation has the form

$$\eta_{n+1} = \eta_n K_{n+1, \eta_n}$$

where $K_{n+1, \eta_n}$ is a nonunique sequence of Markov transitions. The particle
interpretation model associated with a given choice of transitions consists
of a sequence of Markov chains $\xi_n = (\xi_n^i)_{1 \le i \le N}$ on the product spaces $E_n^N$
with elementary transitions

$$\text{Proba}\big(\xi_n \in d(x_n^1, \dots, x_n^N) \mid \xi_{n-1}\big) = \prod_{i=1}^{N} K_{n, m(\xi_{n-1})}(\xi_{n-1}^i, dx_n^i)$$

Under appropriate regularity conditions on the collection $K_{n,\eta}$, the
particle occupation measures $\eta_n^N = m(\xi_n) = \frac{1}{N} \sum_{i=1}^N \delta_{\xi_n^i}$ converge in some
sense and as $N \to \infty$ to the desired distribution $\eta_n$. In Section 2.5.3, we have
presented at least six different choices of possible transitions. Given a pair of
potentials/transitions $(G_n, M_n)$, we can take for instance

$$K_{n+1, \eta_n} = S_{n, \eta_n} M_{n+1} \qquad (11.8)$$

where $S_{n, \eta_n}$ is the collection of selection transitions given by the formula

$$S_{n, \eta_n}(x_n, \cdot) = \frac{G_n(x_n)}{\eta_n\text{-ess-sup}(G_n)}\, \delta_{x_n} + \Big( 1 - \frac{G_n(x_n)}{\eta_n\text{-ess-sup}(G_n)} \Big)\, \Psi_n(\eta_n) \qquad (11.9)$$

For $(0, 1)$-valued potential functions $G_n$, we can alternatively choose

$$S_{n, \eta_n}(x_n, \cdot) = G_n(x_n)\, \delta_{x_n} + (1 - G_n(x_n))\, \Psi_n(\eta_n) \qquad (11.10)$$

We recall that $\Psi_n$ is the Boltzmann-Gibbs transformation associated with
the potential function $G_n$ and defined by

$$\Psi_n(\eta_n)(dx_n) = \frac{1}{\eta_n(G_n)}\, G_n(x_n)\, \eta_n(dx_n)$$

Loosely speaking, one advantage in choosing the first selection transition
is that it increases the proportion of particles that do not interact. The
interacting particle model associated with the first McKean interpretation
model is defined in terms of a two-step selection/mutation transition

$$\xi_n \xrightarrow{\ S_{n, m(\xi_n)}\ } \hat\xi_n \xrightarrow{\ M_{n+1}\ } \xi_{n+1} \qquad (11.11)$$

At time $n = 0$, the system consists of $N$ iid random particles with common
distribution $\eta_0$. During the selection stage, each particle $\xi_n^i$ selects a new
location $\hat\xi_n^i$ with the discrete distribution

$$S_{n, m(\xi_n)}(\xi_n^i, \cdot) = \frac{G_n(\xi_n^i)}{\vee_{j=1}^N G_n(\xi_n^j)}\, \delta_{\xi_n^i} + \Big( 1 - \frac{G_n(\xi_n^i)}{\vee_{j=1}^N G_n(\xi_n^j)} \Big)\, \Psi_n(m(\xi_n))$$






During the mutation stage, we evolve randomly each selected particle
according to the Markov transition $M_{n+1}$. An important and distinctive
feature of these particle approximation models is that they can also be used
to estimate recursively in time the partition functions $\gamma_n(1)$ as well as the
Feynman-Kac path measures (11.7). The last assertion will be discussed in
some detail in Section 11.4. The central idea to approximate the
unnormalized distributions $\gamma_n$ is to use the easily derived product formula

$$\gamma_n(f_n) = \eta_n(f_n) \prod_{p=0}^{n-1} \eta_p(G_p) \qquad (11.12)$$

Mimicking this key representation, we construct a natural unbiased
estimate simply by setting

$$\gamma_n^N(f_n) = \eta_n^N(f_n) \prod_{p=0}^{n-1} \eta_p^N(G_p)$$

For more details on the asymptotic behavior of these approximation
measures, we refer the reader to Chapters 7 and 9.
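
The selection/mutation transition and the product-formula estimate of $\gamma_n(1)$ can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: it assumes a finite state space, a time-homogeneous pair $(G, M)$ with $G$ taking values in $(0, 1]$, and the acceptance rule with the maximum of the potential over the current population. The names `particle_estimate` and `exact`, and the numerical values of `eta0`, `M` and `G`, are hypothetical.

```python
import numpy as np

def particle_estimate(eta0, M, G, n, N, rng):
    """Selection/mutation particle model on a finite state space.

    eta0: initial law, M: transition matrix, G: potential values in (0, 1].
    Returns the occupation measure eta_n^N and the estimate gamma_n^N(1).
    """
    S = len(eta0)
    xi = rng.choice(S, size=N, p=eta0)           # N iid particles with law eta0
    gamma_one = 1.0
    for _ in range(n):
        g = G[xi]
        gamma_one *= g.mean()                    # factor eta_p^N(G_p) of the product formula
        # Selection: keep xi^i with probability G(xi^i) / max_j G(xi^j); otherwise
        # draw a particle with the Boltzmann-Gibbs weights G(xi^j) / sum_k G(xi^k).
        keep = rng.random(N) < g / g.max()
        resampled = rng.choice(xi, size=N, p=g / g.sum())
        xi_hat = np.where(keep, xi, resampled)
        # Mutation: move each selected particle with M(xi_hat^i, .) by inverse CDF.
        u = rng.random(N)
        xi = (u[:, None] > np.cumsum(M[xi_hat], axis=1)).sum(axis=1)
        xi = np.minimum(xi, S - 1)               # guard against round-off in the cumulative sums
    return np.bincount(xi, minlength=S) / N, gamma_one

def exact(eta0, M, G, n):
    """Exact eta_n and gamma_n(1) from the recursion gamma_{p+1} = (gamma_p G) M."""
    gamma = np.array(eta0, dtype=float)
    for _ in range(n):
        gamma = (gamma * G) @ M
    return gamma / gamma.sum(), gamma.sum()

rng = np.random.default_rng(1)
eta0 = np.array([0.5, 0.5])
M = np.array([[0.9, 0.1], [0.2, 0.8]])
G = np.array([1.0, 0.5])
eta_N, gamma_N = particle_estimate(eta0, M, G, n=5, N=20000, rng=rng)
eta, gamma = exact(eta0, M, G, n=5)
```

Comparing `(eta_N, gamma_N)` with `(eta, gamma)` gives a quick sanity check of the particle approximation on a model where the exact flow is available by matrix products.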





11.4 Descendant and Ancestral Genealogies 



The evolutionary particle model described at the end of Section 11.3 has a
natural birth and death interpretation. We refer the reader to Chapter 3.
Tracing back in time the genealogy of each current individual $\hat\xi_n^i$, we define
the ancestral lines

$$(\hat\xi_{0,n}^i, \hat\xi_{1,n}^i, \dots, \hat\xi_{n-1,n}^i, \hat\xi_{n,n}^i)$$

This genealogical tree structure can be used to estimate the Feynman-Kac
path measures defined in (11.7). In some sense, as $N \to \infty$ we have

$$\frac{1}{N} \sum_{i=1}^{N} \delta_{(\hat\xi_{0,n}^i, \dots, \hat\xi_{n-1,n}^i, \hat\xi_{n,n}^i)} \longrightarrow \widehat{Q}_n \qquad (11.13)$$

The precise meaning of this asymptotic result is described in Chapters 7
to 10. This tree-based particle model also provides a natural way to estimate
the conditional distributions of $\widehat{Q}_n$ with respect to the time horizon given
by the following proposition.

Proposition 11.4.1 We fix a time horizon $n > 0$ and we let $(Y_0, \dots, Y_n)$
be a random path on $(E_0 \times \dots \times E_n)$ with distribution $\widehat{Q}_n$. Also let $\widehat{Q}_{p,n}$ be
the marginal of $\widehat{Q}_n$ w.r.t. the first $(p+1)$ coordinates. For each $p < n$, a
version $\widehat{Q}_{n|p}(x_p, d(x_{p+1}, \dots, x_n))$ of the conditional distribution of
$(Y_{p+1}, \dots, Y_n)$ given $(Y_0, \dots, Y_p) = (x_0, \dots, x_p)$ only depends on $Y_p = x_p$ and
it is given for $\widehat{Q}_{p,n}$-a.e. $(x_0, \dots, x_p)$ by the formula

$$\widehat{Q}_{n|p}(x_p, d(x_{p+1}, \dots, x_n)) = \frac{\prod_{q=p+1}^{n} G_q(x_q)}{\widehat{Z}_{n|p}(x_p)}\, P_{n|p}(x_p, d(x_{p+1}, \dots, x_n))$$

with

$$P_{n|p}(x_p, d(x_{p+1}, \dots, x_n)) = M_{p+1}(x_p, dx_{p+1}) \cdots M_n(x_{n-1}, dx_n)$$

and the normalizing constants $\widehat{Z}_{n|p}(x_p) = E\big(\prod_{q=p+1}^{n} G_q(X_q) \mid X_p = x_p\big)$.

Proof:

We first notice that the marginal of $\widehat{Q}_n$ with respect to the first $p+1$
coordinates $(x_0, \dots, x_p)$ is given by

$$\widehat{Q}_{p,n}(d(x_0, \dots, x_p)) = \frac{\widehat{Z}_{n|p}(x_p)}{\widehat{Z}_n} \Big\{ \prod_{q=0}^{p} G_q(x_q) \Big\}\, P_p(d(x_0, \dots, x_p))$$

Then we readily check that

$$\widehat{Q}_n(d(x_0, \dots, x_n)) = \widehat{Q}_{p,n}(d(x_0, \dots, x_p))\, \widehat{Q}_{n|p}(x_p, d(x_{p+1}, \dots, x_n))$$

from which the end of the proof is straightforward.







Loosely speaking, any individual in the genealogical tree, say i^p „, can be 
regarded as the common ancestor of a particle evolution model from time 
p up to the current time n. The resulting tree of descendant individuals
provides a particle approximation of the Q„-conditional distribution of a 
canonical path (Tp, . . . , T„) given Yp = 

To describe more precisely the particle approximations of the conditional
distributions $\widehat{Q}_{n|p}$, it is convenient to count the number of descendant
individuals of each ancestor. To define these numbers, we first fix a time horizon
$n > 0$ and we let $\widehat{N}_{0,n}$ be the number of ancestors at the origin 0; in other
words,

No,n = Card{l <io<N : 

We also let ; 1 < io < ^o,n} be the set of these initial 

^ . 

ancestors. Each of these ancestors is the parent of individuals 
Xo,n ^ Xl,n > ‘1 — • • • ) 



at level p = 1. In the same way, each of these individuals is the parent 
of individuak 



Xl,n 



X2,n > 









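at level $p = 2$, and so on, up to the current population at level $p = n$. To
clarify the presentation, we slightly abuse the notation and we write $i_{[0,p]}$
instead of $(i_0, \dots, i_p)$. We notice that a particular individual $\hat\chi_{p,n}^{i_{[0,p]}}$ at
level $p$ has $\hat{d}_{p,n}(i_{[0,p]})$ descendants at level $n$ and we have the backward
inductive formula

$$\hat{d}_{p,n}(i_{[0,p]}) = \sum_{i_{p+1}=1}^{\widehat{N}_{p+1,n}^{i_{[0,p]}}} \hat{d}_{p+1,n}(i_{[0,p]}, i_{p+1})$$

with the constant terminal conditions $\hat{d}_{n,n}(i_{[0,n]}) = 1$ so that
$\hat{d}_{n-1,n}(i_{[0,n-1]}) = \widehat{N}_{n,n}^{i_{[0,n-1]}}$.

Also notice that we have the backward integer decomposition

$$N = \sum_{i_0=1}^{\widehat{N}_{0,n}} \hat{d}_{0,n}(i_0) = \sum_{i_0=1}^{\widehat{N}_{0,n}} \sum_{i_1=1}^{\widehat{N}_{1,n}^{i_0}} \hat{d}_{1,n}(i_{[0,1]}) = \sum_{i_0=1}^{\widehat{N}_{0,n}} \sum_{i_1=1}^{\widehat{N}_{1,n}^{i_0}} \sum_{i_2=1}^{\widehat{N}_{2,n}^{i_{[0,1]}}} \hat{d}_{2,n}(i_{[0,2]}) = \cdots$$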
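
The backward recursion for the descendant counts is easy to run on stored parent indices. The sketch below is an illustration under an assumed data layout that does not come from the text: `parents[q][i]` holds the index, at level `q`, of the parent of individual `i` at level `q + 1`, and every level carries `N` individuals. The function name `descendant_counts` is hypothetical.

```python
import numpy as np

def descendant_counts(parents, N):
    """Return d with d[p][j] = number of level-n descendants of individual j at level p.

    parents[q][i] is the index at level q of the parent of individual i at level q + 1.
    """
    n = len(parents)
    d = [None] * (n + 1)
    d[n] = np.ones(N, dtype=int)                  # terminal condition d_{n,n} = 1
    for p in range(n - 1, -1, -1):
        # d_{p,n}(j) = sum of d_{p+1,n} over the children of j
        d[p] = np.bincount(parents[p], weights=d[p + 1], minlength=N).astype(int)
    return d

# Toy genealogy with N = 4 individuals and horizon n = 2 (values chosen by hand).
parents = [np.array([0, 0, 1, 3]), np.array([0, 2, 2, 2])]
d = descendant_counts(parents, N=4)
n_ancestors_at_origin = int(np.count_nonzero(d[0]))
```

Individuals with a zero count have no surviving descendant at level `n`, so the nonzero entries of `d[0]` are the initial ancestors and their counts sum to `N`, in line with the integer decomposition above.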

The disintegration formula of the genealogical path-particle distribution 
(11.13) is now given by the following proposition. 








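Proposition 11.4.2 We fix a time horizon $n > 0$ and we let $(Y_0, \dots, Y_n)$
be a random path on $(E_0 \times \dots \times E_n)$ with distribution $\widehat{Q}_n^N$. For each $p < n$,
a version

$$\widehat{Q}_{n|p}^N\big(\hat\chi_{p,n}^{i_{[0,p]}}, d(x_{p+1}, \dots, x_n)\big)$$

of the conditional distribution of $(Y_{p+1}, \dots, Y_n)$ given $Y_p = \hat\chi_{p,n}^{i_{[0,p]}}$ is defined by
the empirical measure of the genealogical tree model of the ancestor $\hat\chi_{p,n}^{i_{[0,p]}}$,
namely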





®^,n(*[0,p]) 



j&Mo.pl oMo.p+i] 

-'>+l,n ^p+2,n 

E E 

ip+i=l »p+2=l 



Nn' 



♦n — 1 



^/^Mo,p+i) 

v^p+l,n ’ 



..xl'V) 



In Figure 11.1, we have presented with a thick line the genealogical tree
model associated with a given ancestor $\hat\chi_{2,5}^{i_{[0,2]}}$. This picture
corresponds to the situation where $(p, n) = (2, 5)$ and $N = 15$. In this
situation, we note that the backward decomposition formula (11.14) reads

$$\hat{d}_{2,5}(i_{[0,2]}) = 5 = 3 + 2 = (2 + 1) + 2$$



and we have ivj"'*' = 2, (TV^ = (2, 1), and 

(j^y = (2,1,2) 



We use the same disintegration technique for the prediction Feynman-Kac
path-measure $Q_n$ (defined as in (11.7), taking the product of potential
functions up to time $(n-1)$) and its genealogical path-particle
approximation



1 ^ 
t=l 







In the display above, $(\xi_{p,n}^i)_{0 \le p \le n}$ represents the ancestral line of the current
individual $\xi_{n,n}^i = \xi_n^i$.

We denote by (dp,n(*(o,pi))^^’i,n)Xp.V') the quantities defined as 




by replacing the updated ancestral lines „ by the predicted lines 
Definition 11.4.1 For any 0 < p < n, and any multi-index t(o,p] = 
(to, ip), with 1 < ifc < h < p, the descendant genealogical 

tree model of the pth ancestor Xp.n*’’ 6 Ep at time n is the random tree 
starting at Xp.n’’' and given by 

v‘|O.Pl V 1 k 

Xp.n * Ap+l,n * • • • * Xn,n 

The descendant genealogy of a given ancestor forms a random tree valued
process that evolves in accordance with the selection/mutation transitions
of the particle model. Figure 11.2 gives a schematic picture of the
descendant genealogical tree models from time $p$ to time $n = p + 4$ associated
with a given individual $x$ at time $p$. Note that any particle $\xi_p^i = x$ in the
complete genealogical tree model at time $p$ can be regarded as the current
descendant individual of an historical process; that is, we have




for some multi-index = (iq, . . . ,ip. When the descendant genealogy 

i* 

of X still survives at time n > p we also have Xp,*’**' = for emy k firom 
p to n, and for some multi-index ij^ pj = (i§, . . . , t*). In other words, x is 
the common ancestor of a sequence of genealogical trees. In Figure 11.2,
we have presented the descendant tree models at times $p+1$, $p+2$, $p+3$,
and $n = p+4$. Note that the number of descendant individuals is not fixed
but depends on the evolution of the whole particle model. From previous
arguments, we observe that the path-particle measures $Q_{k|p}^N(x, \cdot)$ provide
“an $N$-approximation” of the conditional Feynman-Kac path-measures
$Q_{k|p}(x, \cdot)$. We denote by $Q_{[k]|p}^N(x, \cdot)$ and $Q_{[k]|p}(x, \cdot)$ their marginals w.r.t.
the time parameter $k$. For instance, in the time homogeneous situation we
have for any $l \ge 0$

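$$Q_{[p+l]|p}(x, \cdot) = \eta_l^{(x)}$$

where $\eta_l^{(x)}$ represents the solution of the prediction Feynman-Kac flow
starting at $\eta_0^{(x)} = \delta_x$ at time $l = 0$. Using the product formula (11.12)
we have

$$E_{p,x}\Big( f(X_{p+l}) \prod_{q=p}^{p+l-1} G(X_q) \Big) = Q_{[p+l]|p}(f)(x) \prod_{q=p}^{p+l-1} Q_{[q]|p}(G)(x)$$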








These unnormalized measures can be estimated using the $x$-descendant
genealogical measures $Q_{[q]|p}^N(x, \cdot)$ and the product formula

$$Q_{[p+l]|p}^N(f)(x) \prod_{q=p}^{p+l-1} Q_{[q]|p}^N(G)(x)$$



11.5 Conditional Explorations 

In the algorithm presented in (11.11), the particles randomly explore the
state space using an “a priori” mutation transition. The latter terminology
has been chosen in reference to the filtering literature. It simply expresses
the fact that the mutation transition does not depend on the potential
functions. In the present section, we design a more accurate exploration
strategy that depends on the potential functions. The key idea is simply
based on the decomposition

$$M_{n+1}(x_n, dx_{n+1})\, G_{n+1}(x_{n+1}) = \hat{G}_n(x_n)\, \hat{M}_{n+1}(x_n, dx_{n+1})$$

with

$$\hat{M}_{n+1}(x_n, dx_{n+1}) = \frac{M_{n+1}(x_n, dx_{n+1})\, G_{n+1}(x_{n+1})}{M_{n+1}(G_{n+1})(x_n)} \qquad (11.16)$$

and $\hat{G}_n(x_n) = M_{n+1}(G_{n+1})(x_n)$. From the display above, we find that

$$\widehat{Q}_n(d(x_0, \dots, x_n)) = \frac{\eta_0(G_0)}{\widehat{Z}_n} \Big\{ \prod_{q=0}^{n-1} \hat{G}_q(x_q) \Big\}\, \widehat{P}_n(d(x_0, \dots, x_n)) \qquad (11.17)$$




with the path distribution $\widehat{P}_n$ given by

$$\widehat{P}_n(d(x_0, \dots, x_n)) = \hat\eta_0(dx_0)\, \hat{M}_1(x_0, dx_1) \cdots \hat{M}_n(x_{n-1}, dx_n)$$

From these formulae, one deduces that the updated Feynman-Kac flow $\hat\eta_n$
satisfies the equation

$$\hat\eta_{n+1} = \hat\eta_n \hat{K}_{n+1, \hat\eta_n} \qquad (11.18)$$

with the collection of Markov transitions $\hat{K}_{n+1, \hat\eta_n} = \hat{S}_{n, \hat\eta_n} \hat{M}_{n+1}$ defined
as in (11.8) by replacing the pair $(G_n, M_n)$ by $(\hat{G}_n, \hat{M}_n)$. The particle
interpretation model associated with (11.18) is now defined as in (11.11)
by replacing the pairs $(G_n, M_n)$ by $(\hat{G}_n, \hat{M}_n)$. More precisely, this particle
model is again defined by a selection/mutation transition

$$\hat\xi_n \xrightarrow{\ \hat{S}_{n, m(\hat\xi_n)}\ } \tilde\xi_n \xrightarrow{\ \hat{M}_{n+1}\ } \hat\xi_{n+1} \qquad (11.19)$$



Note that during the selection stage each particle $\hat\xi_n^i$ selects a new location
$\tilde\xi_n^i$ with the discrete distribution

$$\hat{S}_{n, m(\hat\xi_n)}(\hat\xi_n^i, \cdot) = \frac{\hat{G}_n(\hat\xi_n^i)}{\vee_{j=1}^N \hat{G}_n(\hat\xi_n^j)}\, \delta_{\hat\xi_n^i} + \Big( 1 - \frac{\hat{G}_n(\hat\xi_n^i)}{\vee_{j=1}^N \hat{G}_n(\hat\xi_n^j)} \Big)\, \hat\Psi_n(m(\hat\xi_n))$$

where $\hat\Psi_n$ is the Boltzmann-Gibbs transformation associated with the
potential function $\hat{G}_n$. During the mutation stage, we randomly evolve each
selected particle according to the Markov transition $\hat{M}_{n+1}$. Particle models
with conditional mutations are particularly useful when the regularity
condition is not met for the initial potential functions $G_n$ but for the new
reference functions $M_{n+1}(G_{n+1}) = \hat{G}_n$ (see Exercise 11.9.6). Nevertheless, and
as we have already mentioned in Chapter 3, page 100, sampling random
transitions according to $\hat{M}_{n+1}$ is usually time-consuming, and we need to
resort to an additional level of approximation.

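One natural idea is to sample the transition $\hat{M}_{n+1}(\tilde\xi_n^i, \cdot)$ from a given
selected particle, say $\tilde\xi_n^i$, using the particle approximation

$$M_{n+1}^{N'}(\tilde\xi_n^i, dx_{n+1}) =_{\text{def.}} \frac{1}{N'} \sum_{k=1}^{N'} \delta_{U_{n+1}^{i,k}}(dx_{n+1})$$

where $(U_{n+1}^{i,k})_{1 \le k \le N'}$ is an auxiliary collection of $N'$ conditionally
independent random variables with distribution $M_{n+1}(\tilde\xi_n^i, \cdot)$. Replacing in (11.16)
the transition $M_{n+1}$ by its $N'$-approximation, we have in some sense

$$\hat{M}_{n+1}(\tilde\xi_n^i, \cdot) \simeq \hat{M}_{n+1}^{N'}(\tilde\xi_n^i, \cdot) =_{\text{def.}} \sum_{k=1}^{N'} \frac{G_{n+1}(U_{n+1}^{i,k})}{\sum_{l=1}^{N'} G_{n+1}(U_{n+1}^{i,l})}\, \delta_{U_{n+1}^{i,k}}$$

and

$$\hat{G}_n(\tilde\xi_n^i) \simeq \frac{1}{N'} \sum_{k=1}^{N'} G_{n+1}(U_{n+1}^{i,k})$$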


FIGURE 11.3. Particle conditional exploration (selected site, weight = (W1+W2+W3+W4)/4)

In Figure 11.3, we have presented with a thick line a single approximate
conditional exploration starting from $x$ and based on $N' = 4$ exploration
transitions $x \to U_{n+1}^k$, $1 \le k \le 4$, with respective weights
$G_{n+1}(U_{n+1}^k) = W_k$.
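
The $N'$-sample approximation of one conditional exploration can be sketched as follows. This is an illustration only: the helper `sample_M` (a Gaussian random-walk move) and the potential `G_next` are hypothetical choices standing in for $M_{n+1}$ and $G_{n+1}$, and the function name `conditional_exploration` is not from the text.

```python
import numpy as np

def conditional_exploration(x, sample_M, G_next, n_aux, rng):
    """Approximate one conditional move from x with n_aux auxiliary explorations.

    sample_M(x, size, rng) draws iid samples from M_{n+1}(x, .)  (hypothetical helper).
    G_next evaluates G_{n+1} on an array of states.
    Returns (new_state, weight) with weight = (1/n_aux) * sum_k G_{n+1}(U^k).
    """
    U = sample_M(x, n_aux, rng)                 # auxiliary explorations U^1, ..., U^{N'}
    W = G_next(U)                               # weights W_k = G_{n+1}(U^k)
    weight = W.mean()                           # approximates M_{n+1}(G_{n+1})(x)
    k = rng.choice(n_aux, p=W / W.sum())        # pick U^k with probability W_k / sum_l W_l
    return U[k], weight

def sample_M(x, size, rng):
    """Gaussian random-walk move around x (an assumed mutation kernel)."""
    return x + rng.standard_normal(size)

def G_next(u):
    """Assumed potential function with values in (0, 1]."""
    return np.exp(-0.5 * u ** 2)

rng = np.random.default_rng(2)
new_state, weight = conditional_exploration(0.3, sample_M, G_next, n_aux=4, rng=rng)
```

For this Gaussian choice the exact value of $M_{n+1}(G_{n+1})(x)$ is $\exp(-x^2/4)/\sqrt{2}$, so increasing `n_aux` lets one see how quickly the averaged weight settles.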

11.6 State-Space Enlargements and Path-Particle 
Models 

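Let $p = (p_n)_{n\ge 0}$ be an increasing sequence of time parameters such that $p_0 = 0$ and $p_n < p_{n+1}$ for all $n \ge 0$. For any $s < t$, we denote by

$$x_{[s,t)} = (x_q)_{s\le q<t} \in E_{[s,t)} = \prod_{s\le q<t} E_q$$

a generic excursion from time $s$ up to time $t$ (excluded). We associate with $p$ the collection of potential functions, Markov transitions, and initial distributions $G^{(p)}_n$, $M^{(p)}_n$, and $\eta^{(p)}_0$ on excursion spaces $E^{(p)}_n$ defined by

$$G^{(p)}_n(x_{[p_n,p_{n+1})}) = \prod_{p_n\le q<p_{n+1}} G_q(x_q)$$

$$M^{(p)}_n(x_{[p_{n-1},p_n)}, dx_{[p_n,p_{n+1})}) = \prod_{p_n\le q<p_{n+1}} M_q(x_{q-1}, dx_q)$$

and

$$\eta^{(p)}_0(dx_{[0,p_1)}) = \eta_0(dx_0)\, M_1(x_0, dx_1) \cdots M_{p_1-1}(x_{p_1-2}, dx_{p_1-1})$$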

Using this notation, we observe that any path from 0 up to time $(p_{n+1}-1)$ can be decomposed with respect to the time mesh $p$ with the excursion decomposition

$$x_{[0,p_{n+1})} = (x_{[0,p_1)}, \ldots, x_{[p_n,p_{n+1})}) \in E_{[0,p_{n+1})} = E^{(p)}_0 \times \cdots \times E^{(p)}_n$$

where $E^{(p)}_n = E_{[p_n,p_{n+1})}$ for all $n \ge 0$. Let $\hat Q^{(p)}_n$ be the updated Feynman-Kac distribution defined as in (11.7) by replacing the triplet $(G_n, M_n, \eta_0)$ by $(G^{(p)}_n, M^{(p)}_n, \eta^{(p)}_0)$.

Proposition 11.6.1 For any $n \ge 0$, we have $\hat Q_{p_{n+1}-1} = \hat Q^{(p)}_n$.

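Proof:

It suffices to observe that $\prod_{q=0}^{p_{n+1}-1} G_q(x_q) = \prod_{q=0}^{n} G^{(p)}_q(x_{[p_q,p_{q+1})})$ and $\mathbb P_{p_{n+1}-1} = \mathbb P^{(p)}_n$ with

$$\mathbb P^{(p)}_n(d(x_{[0,p_1)}, \ldots, x_{[p_n,p_{n+1})})) = \eta^{(p)}_0(dx_{[0,p_1)})\, M^{(p)}_1(x_{[0,p_1)}, dx_{[p_1,p_2)}) \cdots M^{(p)}_n(x_{[p_{n-1},p_n)}, dx_{[p_n,p_{n+1})})$$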



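Using Proposition 11.6.1, we extend the particle model defined in (11.11) into an excursion-valued particle interpretation. To be more precise, let $\xi^{(p)}_n$ be the Markov chain defined as in (11.11) by replacing the set of parameters $(E_n, G_n, M_n, \eta_0)$ by $(E^{(p)}_n, G^{(p)}_n, M^{(p)}_n, \eta^{(p)}_0)$. At time $n = 0$, the system consists of $N$ iid excursions

$$\xi^{(p),i}_0 = (\zeta^i_0, \ldots, \zeta^i_{p_1-1})$$

with common distribution $\eta^{(p)}_0$. During the selection stage, we randomly select $N$ excursions

$$\hat\xi^{(p),i}_n = \hat\zeta^i_{[p_n,p_{n+1})} = (\hat\zeta^i_{p_n}, \ldots, \hat\zeta^i_{p_{n+1}-1})$$

with the discrete distribution

$$\Psi^{(p)}_n(m(\xi^{(p)}_n)) = \sum_{i=1}^{N} \frac{G^{(p)}_n(\xi^{(p),i}_n)}{\sum_{j=1}^{N} G^{(p)}_n(\xi^{(p),j}_n)}\,\delta_{\xi^{(p),i}_n}$$

In the display above $\Psi^{(p)}_n$ represents the Boltzmann-Gibbs transformation on the excursion space $E^{(p)}_n$ associated with the potential function $G^{(p)}_n$. During the mutation stage, we randomly evolve each selected excursion according to the Markov transition $M^{(p)}_{n+1}$

$$\hat\xi^{(p),i}_n \longrightarrow \xi^{(p),i}_{n+1} = \zeta^i_{[p_{n+1},p_{n+2})}$$

In other words, we randomly sample $N$ excursions of length $(p_{n+2} - p_{n+1})$ from the final selected sites

$$\hat\zeta^i_{p_{n+1}-1} \xrightarrow{\;M_{p_{n+1}}\;} \zeta^i_{p_{n+1}} \longrightarrow \zeta^i_{p_{n+1}+1} \longrightarrow \cdots \xrightarrow{\;M_{p_{n+2}-1}\;} \zeta^i_{p_{n+2}-1}$$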







This model and the one introduced in (11.11) are mathematically equivalent, but in practice the excursion particle model often gives more accurate results. For instance, in the simple filtering example discussed in the introduction of the book, the radar observations only give information on the distance between the sensor and the target. To estimate the speed and acceleration components of the signal, we clearly need to use at least three observations. Therefore, in terms of the excursion particle model presented above, it is natural to update the particle excursions using at least three likelihood functions. In this situation, it is more judicious to take $(p_{n+1} - p_n) \ge 3$.
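A minimal sketch of the excursion bookkeeping may help to fix ideas. It assumes real-valued states and a time mesh given as a list `[p_0, p_1, ..., p_{n+1}]`; the function names `split_into_excursions` and `excursion_potential`, and the signature `potentials(q, x)` standing for $G_q(x)$, are hypothetical choices made for this illustration.

```python
from math import prod
from typing import Callable, List, Sequence


def split_into_excursions(path: Sequence[float], mesh: Sequence[int]) -> List[List[float]]:
    """Cut the path (x_0, ..., x_{p_{n+1}-1}) into excursions x_[p_k, p_{k+1})."""
    if mesh[0] != 0 or any(a >= b for a, b in zip(mesh, mesh[1:])):
        raise ValueError("mesh must start at 0 and be strictly increasing")
    if mesh[-1] != len(path):
        raise ValueError("the last mesh point must equal the path length")
    return [list(path[a:b]) for a, b in zip(mesh, mesh[1:])]


def excursion_potential(
    excursion: Sequence[float],
    start: int,
    potentials: Callable[[int, float], float],
) -> float:
    """Product of G_q(x_q) over the time indices q covered by one excursion."""
    return prod(potentials(start + k, x) for k, x in enumerate(excursion))
```

Multiplying the excursion potentials over all excursions of a path gives back the product of the one-step potentials along that path, which is the identity used to compare the two models.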



11.7 Conditional Excursion Particle Models 

We can also combine the excursion particle model described in Section 11.6 
with the conditional exploration techniques presented in Section 11.5. The 
corresponding conditional excursion particle model is defined as the one 
presented in Section 11.6 by replacing by the triplets 

defined by 

(^[Pn - 1 ,Pn ) > ^[Pn >Pn+ 1 ) ) 

_ (^(Pn.Pn-n)) 



and 



sS"’ = (11.20) 

As in Section 11.6, to sample the conditional excursion transition 

from a given selected excursion, say we can use the N'- 

particle approximation 



■) .) =M. i 



fc=l 



[Pn.Pn-f-l) 



where represents an auxiliary collection of N' condi- 
tionally independent random excursions with distribution .). 







Replacing in the definition of the transition by its iV'- 

approximation , we have 



and ^ 






N' 



/5(p) (TTl’^ \ ^n+1 






The resulting $(N, N')$-approximation model is essentially the same as the one we discussed at the end of Section 11.5 except that we are using here an additional excursion exploration mechanism.



11.8 Branching Selection Variants 

11.8.1 Introduction 

In this section, we design branching particle interpretations of the Feynman-Kac model $(\eta_n, \hat\eta_n)$ associated with a pair of potential functions and Markov transitions $(G_n, M_n)$ on some measurable state spaces $(E_n, \mathcal E_n)$ (see Section 11.3 for a more precise definition of these distribution flows).

The best pedagogical way to introduce these branching strategies is to interpret the interacting particle models presented in Section 11.3 as a birth and death Markov chain. Here again, to simplify the presentation we restrict ourselves to $(0,1]$-valued potential functions $G_n$ and McKean transitions given by

$$K_{n+1,\eta_n} = S_{n,\eta_n} M_{n+1}$$

with the selection kernel

$$S_{n,\eta_n}(x, dy) = G_n(x)\,\delta_x(dy) + (1 - G_n(x))\,\Psi_n(\eta_n)(dy) \qquad (11.21)$$

We use the same notations as in (1.16), and we denote by $|g_n| = \sum_{i=1}^{N} g_n^i$ the number of particles which have accepted to stay in the same location and we let $(\tilde\xi_n^j)_{|g_n|<j\le N}$ be a collection of $\tilde N_n = (N - |g_n|)$ conditionally independent and identically distributed selected sites with law $\Psi_n(m(\xi_n))$. After the selection mechanism, the total number of particles at site $\xi_n^i$ is clearly given by

$$b_n^i = g_n^i + \tilde b_n^i \quad\text{and}\quad \tilde b_n^i = \mathrm{Card}\{|g_n| < j \le N : \tilde\xi_n^j = \xi_n^i\}$$

where

$$(\tilde b_n^1, \ldots, \tilde b_n^N) \overset{\mathrm{law}}{=} \mathrm{Multinomial}(\tilde N_n, W_n^1, \ldots, W_n^N) \quad\text{with}\quad W_n^i = \frac{G_n(\xi_n^i)}{\sum_{j=1}^{N} G_n(\xi_n^j)} \qquad (11.22)$$
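The acceptance mechanism behind the selection kernel (11.21) can be sketched as follows. This is an illustration only: `acceptance_selection` is a hypothetical helper, the particles are stored in a plain list, and the potential is assumed to take values in $(0,1]$ as required above. Each particle stays at its location with probability $G_n(x)$; otherwise it jumps to a site drawn from the weighted empirical measure $\Psi_n(m(\xi_n))$.

```python
import random
from typing import Callable, List, Sequence


def acceptance_selection(
    particles: Sequence[float],
    potential: Callable[[float], float],
    rng: random.Random,
) -> List[float]:
    """One selection step in the spirit of (11.21), for (0, 1]-valued potentials."""
    weights = [potential(x) for x in particles]
    if any(not (0.0 < w <= 1.0) for w in weights):
        raise ValueError("the potential must take values in (0, 1]")
    selected = []
    for x, g in zip(particles, weights):
        if rng.random() < g:
            # Acceptance: the particle stays in the same location.
            selected.append(x)
        else:
            # Rejection: jump to a site drawn with probability proportional to G.
            selected.append(rng.choices(particles, weights=weights, k=1)[0])
    return selected
```

When the potential is identically equal to 1 every particle is accepted, so the selection step leaves the configuration unchanged.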







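The $\mathbb N$-valued random variables $(b_n^i)_{1\le i\le N}$ can be interpreted as the random number of births of the individuals $(\xi_n^i)_{1\le i\le N}$. Note that we have

$$E(b_n^i \mid \xi_n) = E(g_n^i \mid \xi_n) + E\Big(\sum_{j=|g_n|+1}^{N} 1_{\xi_n^i}(\tilde\xi_n^j) \,\Big|\, \xi_n\Big)$$

$$= G_n(\xi_n^i) + (1 - m(\xi_n)(G_n))\, G_n(\xi_n^i)/m(\xi_n)(G_n)$$

$$= G_n(\xi_n^i)/m(\xi_n)(G_n)$$

and therefore, with some obvious abusive notation

$$E\Big(\frac{1}{N}\sum_{i=1}^{N} b_n^i\,\delta_{\xi_n^i} \,\Big|\, \xi_n\Big) = \Psi_n(m(\xi_n))$$

In much the same way, the local branching $L_2$-mean errors are given by

$$E\Big(\Big(\sum_{i=1}^{N} b_n^i f_n(\xi_n^i) - N\, m(\xi_n) S_{n,m(\xi_n)}(f_n)\Big)^2 \,\Big|\, \xi_n\Big)$$

$$= N\, m(\xi_n) S_{n,m(\xi_n)}\big(f_n - S_{n,m(\xi_n)}(f_n)\big)^2$$

$$= N\, \Psi_n(m(\xi_n))\big(f_n - \Psi_n(m(\xi_n)) f_n\big)^2 - N\, m(\xi_n)\big(G_n^2 (f_n - \Psi_n(m(\xi_n)) f_n)^2\big)$$

$$= N\, \Psi_n(m(\xi_n))\big(f_n - \Psi_n(m(\xi_n))(f_n)\big)^2 - N\, m(\xi_n)(G_n)\, \Psi_n(m(\xi_n))\big(G_n (f_n - \Psi_n(m(\xi_n))(f_n))^2\big)$$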

It is also instructive to compare the local variance of this model with the one of the simple genetic model associated with the McKean selection kernel

$$S_{n,\eta_n}(x_n, d\hat x_n) = \Psi_n(\eta_n)(d\hat x_n) \qquad (11.23)$$

In this situation, the selection transition simply consists in sampling randomly $N$ conditionally independent random particles $\hat\xi_n^i$ with the same discrete distribution $\Psi_n(m(\xi_n))$. The number of particles that have selected the $i$-th site

$$b_n^i = \mathrm{Card}\{1 \le j \le N : \hat\xi_n^j = \xi_n^i\}$$

can be interpreted as the number of births of the individual $\xi_n^i$ and we have

$$(b_n^1, \ldots, b_n^N) \overset{\mathrm{law}}{=} \mathrm{Multinomial}(N, W_n^1, \ldots, W_n^N) \qquad (11.24)$$

The local error variance is now given by

$$E\Big(\Big(\sum_{i=1}^{N} b_n^i f_n(\xi_n^i) - N\, m(\xi_n) S_{n,m(\xi_n)}(f_n)\Big)^2 \,\Big|\, \xi_n\Big) = N\, \Psi_n(m(\xi_n))\big(f_n - \Psi_n(m(\xi_n)) f_n\big)^2$$







We readily observe that this local variance is greater than the one of the interacting model associated with the McKean transition (11.21). In comparison to (11.23), the McKean selection transition (11.21) contains an important extra acceptance term which allows particles with high potential to stay in the same level. In this sense, the simple genetic selection contains too much randomness. For instance, in the somewhat degenerate case where $G_n$ are constant functions $G_n = 1$, the local selection variance associated with the transition (11.21) is null while the one associated with (11.23) is equal to $N\, m(\xi_n)(f_n - m(\xi_n) f_n)^2$.
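The comparison between the two local variances can be checked numerically on a given configuration. The sketch below evaluates the closed-form expressions for the acceptance selection (11.21) and for the simple multinomial selection (11.23); the function name `local_selection_variances` and the list-based inputs (`g` holds the values $G_n(\xi_n^i)$, `f` the values $f_n(\xi_n^i)$) are assumptions made for this illustration.

```python
from typing import Sequence, Tuple


def local_selection_variances(g: Sequence[float], f: Sequence[float]) -> Tuple[float, float]:
    """Return (acceptance_variance, multinomial_variance) for one configuration.

    g[i] is the potential value at particle i (assumed in (0, 1]) and
    f[i] the test function value at particle i.
    """
    if len(g) != len(f) or not g:
        raise ValueError("g and f must be non-empty and of equal length")
    if any(not (0.0 < gi <= 1.0) for gi in g):
        raise ValueError("potential values must lie in (0, 1]")
    n = len(g)
    total_g = sum(g)
    w = [gi / total_g for gi in g]                     # Boltzmann-Gibbs weights
    mean_f = sum(wi * fi for wi, fi in zip(w, f))      # Psi(m)(f)
    # N * Psi(m)((f - Psi(m) f)^2): variance of the multinomial selection.
    multinomial = n * sum(wi * (fi - mean_f) ** 2 for wi, fi in zip(w, f))
    # N * m(G^2 (f - Psi(m) f)^2): reduction due to the acceptance term.
    reduction = sum(gi ** 2 * (fi - mean_f) ** 2 for gi, fi in zip(g, f))
    return multinomial - reduction, multinomial
```

For a constant potential equal to 1 the first value vanishes while the second does not, in line with the degenerate case discussed above.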

The question discussed above is clearly related to the Efron and the Bayesian weighted bootstrap theory (see for instance [25] and references therein). Roughly speaking, the original idea of the bootstrap is that a sample empirical measure calculated from an observed sample distribution is sufficiently like the unknown distribution of the observations $X$. As a result, we can use the bootstrapped sample to estimate some statistical quantities of the unknown distribution such as the mean or the variance.

To better connect the bootstrap ideas with the previous discussion, it is convenient to interpret $X = (X^i)_{1\le i\le N}$ as $N$ independent observations of some unknown distribution $\eta$. In this interpretation, $\Psi(m(X))$ is an observation of the unknown updated distribution $\Psi(\eta)$ and finally $\hat X = (\hat X^i)_{1\le i\le N}$ is the corresponding bootstrapped sample. The question addressed above is now essentially equivalent to finding a bootstrap strategy with respect to the observed distribution $\Psi(m(X))$. Using some standard abusive notation, if $X = (X^i)_{1\le i\le N}$ represents a sequence of iid random variables with common law $\eta$ then for any test function $f$ and as $N \to \infty$ we have



y/N{m{X)-m)if) 

with a^if) = — 'J' (»?)(/))*)• On the other hand, using the same 

techniques as those presented in Chapter 9 we prove that 

v/iV(m(X)-^(m(X)))(/) Ar(0,<r(/)) 

The variances a{f) in the two situations examined in (11.23) and (11.21) 
are respectively given by <r(/) = ff{f) and by 

These rather elementary comparisons indicate that whenever the observed distribution has the form $\Psi(\eta) = \eta S_\eta$ we expect to construct more accurate estimates when bootstrapping in accordance with $S_\eta$.







11.8.2 Description of the Models 

The idea behind the forthcoming construction is to replace the multinomial branching transition by different types of distributions such as Poisson, Bernoulli, Binomial and others. As an important rule, we keep the acceptance mechanism and we only change the distribution of the branching numbers $b_n^i$. Our models will not be restricted to fixed population size but they take values in the state space

$$\mathbf E_n = \bigcup_{p\in\mathbb N} (\{p\} \times E_n^p)$$

with the convention $E_n^0 = \{\Delta\}$ a cemetery point if $p = 0$. The parameter $p \in \mathbb N$ represents the size of the system. The initial number of particles $N_0 \in \mathbb N$ is a fixed non-random number which represents the precision parameter of the branching approximation model. The evolution in time of the branching model

$$(N_n, \xi_n) \longrightarrow (\hat N_n, \hat\xi_n) \longrightarrow (N_{n+1}, \xi_{n+1})$$

is conducted as follows. As before, the initial particle system $\xi_0 = (\xi_0^i)_{1\le i\le N_0}$ consists of $N_0$ independent and identically distributed particles with common law $\eta_0$. At the time $n$, the particle system consists of $N_n$ particles. If $N_n = 0$, the system collapses, and we let $(\hat N_p, \hat\xi_p) = (N_{p+1}, \xi_{p+1}) = (0, \Delta)$ for any $p \ge n$. Otherwise, each particle $\xi_n^i$ branches into a random number of offsprings $b_n^i$, $1 \le i \le N_n$, and the mechanism is chosen so that the following unbias and $L_2$ conditions hold. For any $f_n \in \mathrm{Osc}(E_n)$, we have on the event $\{N_n > 0\}$

$$E\Big(\sum_{i=1}^{N_n} b_n^i f_n(\xi_n^i) \,\Big|\, \xi_n\Big) = N_n\, \Psi_n(m(\xi_n))(f_n)$$

$$E\Big(\Big[\sum_{i=1}^{N_n} b_n^i f_n(\xi_n^i) - N_n\, \Psi_n(m(\xi_n))(f_n)\Big]^2 \,\Big|\, \xi_n\Big) \le c\, N_n \qquad (11.25)$$

At the end of this stage, the particle system consists of $\hat N_n = \sum_{i=1}^{N_n} b_n^i$ particles denoted by

$$\hat\xi_n^i = \xi_n^k \quad\text{for}\quad \sum_{l=1}^{k-1} b_n^l < i \le \sum_{l=1}^{k-1} b_n^l + b_n^k, \quad 1 \le k \le N_n \qquad (11.26)$$

The mutation transition is defined as before, except if we have $\hat N_n = 0$. In this case, the system dies, and we set $(N_p, \xi_p) = (\hat N_p, \hat\xi_p) = (0, \Delta)$ for any $p > n$.
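The relabeling of the offsprings after a branching step can be written in a few lines. The helper below is a sketch under our own conventions: `branch` is a hypothetical name, and the cemetery state is encoded by `None`, which is an assumption of this illustration rather than the text's notation.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def branch(particles: Sequence[T], counts: Sequence[int]) -> Optional[List[T]]:
    """Replace each particle by counts[k] copies of itself, in order.

    Returns None (standing for the cemetery state) when no offspring is produced.
    """
    if len(particles) != len(counts):
        raise ValueError("one branching number is required per particle")
    if any(b < 0 for b in counts):
        raise ValueError("branching numbers must be non-negative")
    # Offsprings of particle k occupy a contiguous block of b_k indices.
    offspring = [x for x, b in zip(particles, counts) for _ in range(b)]
    return offspring if offspring else None
```

The population size after branching is simply the sum of the branching numbers, and an empty population is absorbing.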

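The $N_0$-approximation measures of the Feynman-Kac distributions $\hat\eta_n$ and $\eta_n$ are respectively defined by

$$\hat\eta^{N_0}_n = \frac{1}{N_0}\sum_{i=1}^{\hat N_n} \delta_{\hat\xi_n^i} \quad\text{and}\quad \eta^{N_0}_n = \frac{1}{N_0}\sum_{i=1}^{N_n} \delta_{\xi_n^i}$$

with the convention that $\sum_{\emptyset} = 0$, the null measure on $E_n$. By construction, we observe that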



=liVn>o and ^7n = E 

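In terms of these particle measures, the conditions (11.25) are equivalent on the event $\{N_n > 0\}$ to the following ones

$$N_0\, E(\hat\eta^{N_0}_n(f_n) \mid \xi_n) = N_n\, \Psi_n(\eta^{N_0}_n)(f_n)$$

$$E\big([N_0\, \hat\eta^{N_0}_n(f_n) - N_n\, \Psi_n(\eta^{N_0}_n)(f_n)]^2 \mid \xi_n\big) \le c\, N_n$$

Finally, observe that on the event $\{N_n > 0\}$ we have

$$E\big([N_0\, \hat\eta^{N_0}_n(f_n) - N_0\, \Psi_n(\eta^{N_0}_n)(f_n)]^2 \mid \xi_n\big)$$

$$= E\big([N_0\, \hat\eta^{N_0}_n(f_n) - N_n\, \Psi_n(\eta^{N_0}_n)(f_n)]^2 \mid \xi_n\big) + [N_0 - N_n]^2\, [\Psi_n(\eta^{N_0}_n)(f_n)]^2$$

$$\le c\, (N_n + [N_0 - N_n]^2)$$

which in turn implies that

$$E\big(|\hat\eta^{N_0}_n(f_n) - \Psi_n(\eta^{N_0}_n)(f_n)|^2 \mid \xi_n\big) \le c\, \big(N_n/N_0^2 + |1 - N_n/N_0|^2\big) \qquad (11.27)$$

Finally, we observe that the size of the system doesn't change during the exploration phase $N_n = \hat N_{n-1}$, and we have the unbias property

$$E(\eta^{N_0}_n(f_n) \mid \hat\xi_{n-1}) = \hat\eta^{N_0}_{n-1} M_n(f_n)$$

and the local sampling $L_2$-estimates

$$N_0\, E\big([\eta^{N_0}_n(f_n) - \hat\eta^{N_0}_{n-1} M_n(f_n)]^2 \mid \hat\xi_{n-1}\big) = \hat\eta^{N_0}_{n-1} M_n\big(f_n - M_n(f_n)\big)^2 \le c\, N_n/N_0$$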

11.8.3 Some Branching Selection Rules 

In Section 11.8.1, we have already presented two multinomial type strategies (11.22) and (11.24). These two interacting models correspond to the two different choices of McKean selection transitions given in (11.21) and (11.23). We have also seen that the first McKean interpretation (11.21) with the multinomial branching (11.22) reduces the local variance of the branching selection model.

In this section, we provide a series of branching laws which satisfy the pair condition (11.25). To clarify the presentation, we restrict our discussion to branching interpretations of the second McKean model (11.23), thus leaving aside the improvement we can get in practice by using the first McKean interpretation. Nevertheless, we emphasize that all the forthcoming branching rules can be combined without further work with the acceptance/rejection McKean selection transition (11.21).

We describe branching selections at a given time $n$, and it is also implicitly assumed that the system at that time has not collapsed. In other words, all the forthcoming laws are defined on the event $\{N_n > 0\}$. We also simplify the notation and we set

We recall that $\lfloor a \rfloor$ (respectively $\{a\} = a - \lfloor a \rfloor$) is the integer part (resp. the fractional part) of a number $a \in \mathbb R$.

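• Remainder stochastic sampling rules. Each particle $\xi_n^i$ first branches directly into a fixed and "deterministic" number of offsprings $\tilde b_n^i = \lfloor N_n W_n^i \rfloor$ so that the intermediate population consists of $\tilde N_n = \sum_{i=1}^{N_n} \lfloor N_n W_n^i \rfloor$ particles. It can be seen that at least one particle has one offspring. Otherwise, we would have

$$\forall i \quad \lfloor N_n W_n^i \rfloor = 0 \;\Longrightarrow\; \forall i \quad N_n W_n^i = \{N_n W_n^i\} < 1$$

which would contradict the fact that $N_n = \sum_{i=1}^{N_n} N_n W_n^i$. Therefore, using this preliminary deterministic branching rule, the particle model never collapses. Nevertheless, to ensure that the unbias and $L_2$ conditions are met, we need to introduce an additional branching rule. Two strategies can be underlined.

One natural way to keep the size of the system fixed is to introduce in this population $\bar N_n = N_n - \tilde N_n = \sum_{i=1}^{N_n} \{N_n W_n^i\}$ additional particles. To do this, we introduce an additional sequence of branching numbers

$$(\bar b_n^1, \ldots, \bar b_n^{N_n}) \overset{\mathrm{law}}{=} \mathrm{Multinomial}\Big(\bar N_n, \frac{\{N_n W_n^1\}}{\sum_{j=1}^{N_n} \{N_n W_n^j\}}, \ldots, \frac{\{N_n W_n^{N_n}\}}{\sum_{j=1}^{N_n} \{N_n W_n^j\}}\Big) \qquad (11.28)$$

and we set $b_n^i = \tilde b_n^i + \bar b_n^i$. In other words, each particle $\xi_n^i$ again produces a number of $\bar b_n^i$ additional offsprings. Note that the multinomial random numbers (11.28) can alternatively be defined as follows

$$\bar b_n^k = \mathrm{Card}\{1 \le i \le \bar N_n : \bar\xi_n^i = \xi_n^k\}, \qquad 1 \le k \le N_n$$

where $(\bar\xi_n^1, \ldots, \bar\xi_n^{\bar N_n})$ are $\bar N_n$ independent random variables with common law $\sum_{i=1}^{N_n} \frac{\{N_n W_n^i\}}{\sum_{j=1}^{N_n} \{N_n W_n^j\}}\, \delta_{\xi_n^i}$. It is easily checked that (11.25) are satisfied.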







We can alternatively use the independent Bernoulli resampling numbers defined by

$$P(\bar b_n^i = 1 \mid \xi_n) = 1 - P(\bar b_n^i = 0 \mid \xi_n) = \{N_n W_n^i\} \qquad (11.29)$$

Also note that condition (11.25) is met since we have

$$E(b_n^i \mid \xi_n) = N_n W_n^i$$

$$\mathrm{Var}(b_n^i \mid \xi_n) = \mathrm{Var}(\bar b_n^i \mid \mathcal F_n) = \{N_n W_n^i\}\,(1 - \{N_n W_n^i\}) \in [0, 1/4] \qquad (11.30)$$

• Independent branching numbers. The Poisson branching numbers are defined as a sequence $b_n^i$ of conditionally independent random numbers with distribution given for any $k \ge 0$ by

$$P(b_n^i = k \mid \xi_n) = \exp(-N_n W_n^i)\, \frac{(N_n W_n^i)^k}{k!} \qquad (11.31)$$

Since we have $E(b_n^i \mid \mathcal F_n) = N_n W_n^i = \mathrm{Var}(b_n^i \mid \mathcal F_n)$, we readily check that conditions (11.25) are met. The binomial branching numbers are defined as a sequence $b_n^i$ of conditionally independent random numbers with distribution given for any $0 \le k \le N_n$ by

$$P(b_n^i = k \mid \xi_n) = \binom{N_n}{k} (W_n^i)^k\, (1 - W_n^i)^{N_n - k} \qquad (11.32)$$

In this case, the pair condition (11.25) follows from the fact that $E(b_n^i \mid \xi_n) = N_n W_n^i$ and $\mathrm{Var}(b_n^i \mid \xi_n) = N_n W_n^i (1 - W_n^i)$.
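The four branching rules listed above can be sketched with the standard library only. The function names are hypothetical, the weights are assumed to be already normalized ($\sum_i W_n^i = 1$), and the current population size is taken to be the number of weights. The Poisson sampler uses a simple multiplication method that is adequate for the moderate means $N_n W_n^i$ of this illustration, not a production-grade generator.

```python
import math
import random
from typing import List, Sequence


def normalized_weights(potentials: Sequence[float]) -> List[float]:
    """W^i = G(xi^i) / sum_j G(xi^j)."""
    total = sum(potentials)
    if total <= 0.0:
        raise ValueError("potentials must have a positive sum")
    return [g / total for g in potentials]


def remainder_multinomial(weights: Sequence[float], rng: random.Random) -> List[int]:
    """floor(N W^i) offsprings, plus a multinomial draw on the fractional parts."""
    n = len(weights)
    scaled = [n * w for w in weights]
    counts = [math.floor(s) for s in scaled]
    fractions = [s - c for s, c in zip(scaled, counts)]
    missing = n - sum(counts)  # number of additional particles to allocate
    if missing > 0 and sum(fractions) > 0.0:
        for i in rng.choices(range(n), weights=fractions, k=missing):
            counts[i] += 1
    return counts


def remainder_bernoulli(weights: Sequence[float], rng: random.Random) -> List[int]:
    """floor(N W^i) offsprings, plus an independent Bernoulli({N W^i}) extra one."""
    n = len(weights)
    counts = []
    for w in weights:
        base = math.floor(n * w)
        counts.append(base + (1 if rng.random() < n * w - base else 0))
    return counts


def _poisson(lam: float, rng: random.Random) -> int:
    """Multiplication method; suitable for moderate lam only (exp(-lam) underflows)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def poisson_branching(weights: Sequence[float], rng: random.Random) -> List[int]:
    """Independent Poisson(N W^i) branching numbers; the size is random."""
    n = len(weights)
    return [_poisson(n * w, rng) for w in weights]


def binomial_branching(weights: Sequence[float], rng: random.Random) -> List[int]:
    """Independent Binomial(N, W^i) branching numbers; the size is random."""
    n = len(weights)
    return [sum(1 for _ in range(n) if rng.random() < w) for w in weights]
```

Only the first rule keeps the population size fixed; the Bernoulli, Poisson and binomial rules produce a random total number of offsprings.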

11.8.4 Some h2-Tnean Error Estimates 

In this section, we investigate the asymptotic behavior of the branching particle models described in Section 11.8.2. Our approach follows essentially the same lines of arguments as the one developed in Section 7.4.3. For a brief overview on the Feynman-Kac semigroups involved in the forthcoming analysis, we recommend the reader to start his/her study with Section 7.2 and Section 7.4.3. Our first task is to quantify the probability of extinction.

Lemma 11.8.1 The total mass process $N_n$ is a non-negative and integer valued martingale with respect to the filtration $\mathcal F_n = \sigma(\xi_p, p \le n)$. In addition, we have for any $n \ge 0$

$$N_0\, E\Big(\sup_{0\le p\le n} (1 - N_p/N_0)^2\Big) \le c\, n \quad\text{and}\quad N_0\, P(N_n = 0) \le c\, n$$







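Proof:

To check the martingale property, it suffices to notice that

$$E(N_{n+1} \mid \mathcal F_n) = E(\hat N_n \mid \mathcal F_n) = \sum_{i=1}^{N_n} E(b_n^i \mid \mathcal F_n) = N_n$$

By Doob's maximal inequality, the proof of the $L_2$-estimate amounts to proving that $E(N_n^2) \le N_0^2 + c\, n\, N_0$. The latter is easily checked by induction, and using the fact that

$$E\big((N_n - N_{n-1})^2 \mid \mathcal F_{n-1}\big) = E\Big(\Big(\sum_{i=1}^{N_{n-1}} (b_{n-1}^i - N_{n-1} W_{n-1}^i)\Big)^2 \,\Big|\, \mathcal F_{n-1}\Big) \le c\, N_{n-1}$$

The second estimate is a simple consequence of the first one. To see this claim, we use Chebyshev's inequality to check that for any $\epsilon \in\, ]0, 1[$

$$P(N_n = 0) \le P(N_n \le \epsilon N_0) \le P(N_0 - N_n \ge (1-\epsilon) N_0) \le P(|N_0 - N_n| \ge (1-\epsilon) N_0) \le \frac{c\, n}{N_0 (1-\epsilon)^2}$$

We end the proof of the lemma by letting $\epsilon \to 0$.

■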

This simple lemma, combined with an elementary induction with respect to the time parameter, already gives some useful $L_2$-mean error estimates for the branching approximation model. To describe this technique, we recall that $N_{n+1} = \hat N_n = \sum_{i=1}^{N_n} b_n^i > 0 \Rightarrow N_n > 0$, from which we find the decomposition

\Vn+l-Vn+l\ ljV„+,>0 

< \Vn+l - VnMn+l\ - «'„(t?^)]M„+i| 1jv„>0 

+ \^n+liVn) - ^n+liVn)\ 1n„>0 
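Now, on the event $\{N_n > 0\}$, we notice that

$$[\Phi_{n+1}(\eta^{N_0}_n) - \Phi_{n+1}(\eta_n)](f_{n+1}) = \frac{1}{\eta^{N_0}_n(\bar G_n)}\, [\eta^{N_0}_n - \eta_n](\bar f_{n+1})$$

with the functions $\bar f_{n+1}$ and $\bar G_n$ defined by

$$\bar f_{n+1} = \bar G_n\, \big(M_{n+1}(f_{n+1}) - \Phi_{n+1}(\eta_n)(f_{n+1})\big) \quad\text{and}\quad \bar G_n = G_n/\eta_n(G_n)$$

This readily yields that

$$|[\Phi_{n+1}(\eta^{N_0}_n) - \Phi_{n+1}(\eta_n)](f_{n+1})|\, 1_{N_n>0} \le c\, |[\eta^{N_0}_n - \eta_n](\bar f_{n+1})|\, 1_{N_n>0}$$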






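On the other hand, by definition of the branching model, and using (11.27), we have that

$$E\big(E([(\eta^{N_0}_{n+1} - \hat\eta^{N_0}_n M_{n+1})(f_{n+1})]^2 \mid \xi_n, \hat\xi_n)\, 1_{N_n>0} \mid \xi_n\big) \le c\, E(\hat N_n/N_0^2 \mid \xi_n) = c\, N_n/N_0^2$$

and

$$E\big([\hat\eta^{N_0}_n(f_n) - \Psi_n(\eta^{N_0}_n)(f_n)]^2 \mid \xi_n\big)\, 1_{N_n>0} \le c\, \big(N_n/N_0^2 + [1 - N_n/N_0]^2\big)$$

Using Lemma 11.8.1, if we set

$$I_n = \sqrt{N_0}\, \sup_{f_n\in\mathrm{Osc}(E_n)} E\big(([\eta^{N_0}_n - \eta_n](f_n)\, 1_{N_n>0})^2\big)^{1/2}$$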

then we find that /^x < c [/^ + fro™ which we get the rather crude 
estimate 

sup sup E(([r?^ - t?„l(/„)ljv„>o)^)‘/^ < c” 

^o>i /„€Osc(e„) 

for some $c > 1$. One drawback of the elementary induction technique presented above is that it over-estimates the $L_2$-mean errors. To get one step further, we first note that the branching rules performed at each stage of the algorithm can be interpreted as local perturbations of the limiting evolution equation. To improve the above rather crude estimation, it is therefore convenient to refine our analysis so as to bring in the stability properties of the limiting measure-valued process. As usual, these properties are expressed in terms of the pair parameters $(r_{p,n}, \beta(P_{p,n}))$ introduced in (7.3) on page 218 (see also Section 7.4.3).

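Theorem 11.8.1 For any $n \ge 0$, and $f_n \in \mathrm{Osc}(E_n)$ we have

$$\sqrt{N_0}\; E\big(|[\eta^{N_0}_n - \eta_n](f_n)\, 1_{N_n>0}|^2\big)^{1/2} \le c \sum_{p=0}^{n} (1 + \sigma_p(N))\, r_{p,n}\, \beta(P_{p,n})$$

with $\sigma_p^2(N) = N_0\, \mathrm{Var}(N_{p-1}/N_0)$ and the convention $N_{-1} = N_0$ for $p = 0$.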

Proof: 

The proof is based on the following decomposition 

n 

ivi! - Vn) 1n„>0 = Yi 1n„>0 (H.33) 

9=0 

Arguing as in the proof of Theorem 7.4.4, we readily prove the inequality 
lAr„>0 - ^9,n(^9«-l))](/n)l 

< U ,>0 r,,„ ^(P,,n) |[<--^9(»?9^-l)l(Ol 



(11.34) 







with the random function = Q^n(/n)/IIQ^n(/n)ll random 

operator from Bb{En) into Bb{Eg) defined in (7.25), page 245. To take 
the final step, we use the estimate 

l|V.>ol|>)J' - *,(<-1)1(01 <h+h 

with 

A = ls,-.xill<-<-iJ^,l(Ol 
h = li,.-,>oll<.,-Vi«-i)l(M,Ol 

By definition of the branching particle model, we have 

= lir - W,(01"(§-i) < c N,/NS 

and 

< c{Ng.i/N§ + [l-Ng.i/No?) 

By Lemma 11.8.1, these almost sure estimates yield that 

E(7?) V E(/|) < c (1/No + E([l - Ng-i/No]^)) 
from which we conclude that 

E(ljv,>o|« - < c {l/y/No + E([l - Ng.i/No]^)^^^) 

The end of the proof is now a simple consequence of (11.33) and (11.34). ■ 

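Corollary 11.8.1 We further assume that the regularity conditions $(G)$ and $(M)_m$ are met for some parameters $(m, \epsilon_n(G), \epsilon_n(M))$ such that

$$\epsilon(G) = \wedge_n\, \epsilon_n(G) > 0 \quad\text{and}\quad \epsilon(M) = \wedge_n\, \epsilon_n(M) > 0$$

Then, for any $n \ge 0$, and $f_n \in \mathrm{Osc}_1(E_n)$, we have

$$\sqrt{N_0}\; E\big(|[\eta^{N_0}_n - \eta_n](f_n)\, 1_{N_n>0}|^2\big)^{1/2} \le c\, m\, \big(1 + [N_0\, \mathrm{Var}(N_n/N_0)]^{1/2}\big)/\big(\epsilon(G)^{2m-1} \epsilon^3(M)\big)$$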






In particular, we have 

•M E (Ihir - InK/J * < C ra (1 + 

and for conservative particle models (i.e. $N_n = N_0$), we have the uniform estimate

$$\sqrt{N_0}\; \sup_{n\ge 0} E\big(|[\eta^{N_0}_n - \eta_n](f_n)|^2\big)^{1/2} \le c\, m/\big(\epsilon(G)^{2m-1} \epsilon^3(M)\big)$$

Proof: 

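Arguing as in the proof of Theorem 7.4.4, we have the estimates

$$\beta(P_{p,n}) \le \big(1 - \epsilon^2(M)\, \epsilon^{m-1}(G)\big)^{\lfloor (n-p)/m \rfloor} \quad\text{and}\quad r_{p,n} \le \epsilon^{-1}(M)\, \epsilon^{-m}(G)$$

from which we find that

$$\sum_{q=0}^{n} r_{q,n}\, \beta(P_{q,n}) \le \epsilon^{-1}(M)\, \epsilon^{-m}(G) \sum_{q=0}^{n} \big(1 - \epsilon^2(M)\, \epsilon^{m-1}(G)\big)^{\lfloor (n-q)/m \rfloor}$$

$$\le m\, \epsilon^{-1}(M)\, \epsilon^{-m}(G) \sum_{k=0}^{\lfloor n/m \rfloor} \big(1 - \epsilon^2(M)\, \epsilon^{m-1}(G)\big)^{k}$$

Now, recalling that $N_n$ is a martingale we see that $E(N_n^2) = E(\langle N \rangle_n)$ is an increasing sequence, and by Lemma 11.8.1, we have for any $p \le n$

$$\mathrm{Var}(N_p/N_0) = \big(E((N_p/N_0)^2) - 1\big) \le \mathrm{Var}(N_n/N_0) \le c\, n/N_0$$

The end of the proof of the corollary is now a simple consequence of the estimate stated in Theorem 11.8.1. ■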

Mimicking the product formula presented in Lemma 2.3.1, we adopt the following

Definition 11.8.1 The $N_0$-particle approximation measures $\gamma^{N_0}_n$ of the unnormalized measures $\gamma_n$ are defined for any $f_n \in B_b(E_n)$ by the product formula

$$\gamma^{N_0}_n(f_n) = \eta^{N_0}_n(f_n) \prod_{p=0}^{n-1} m(\xi_p)(G_p)$$

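Proposition 11.8.1 For any $n \ge 0$ and $f_n \in B_b(E_n)$, the random sequence defined by

$$\Gamma_{p,n}(f_n) = \sqrt{N_0}\, \big[\gamma^{N_0}_p(Q_{p,n}(f_n)) - \gamma_p(Q_{p,n}(f_n))\big], \qquad p \le n$$

is an $\mathcal F$-martingale with increasing process given by the formula

$$N_0\, \langle \Gamma_{\cdot,n}(f_n) \rangle_p = \sum_{k=0}^{p} 1_{N_{k-1}>0} \Big(\prod_{l=0}^{k-1} m(\xi_l)(G_l)\Big)^2 \times E\big(([N_0\, \eta^{N_0}_k - N_{k-1}\, \Phi_k(\eta^{N_0}_{k-1})](Q_{k,n} f_n))^2 \mid \mathcal F_{k-1}\big)$$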







Proof: 

We use the decomposition 

7n^ (/n) ~ 7n(/n) = {Qp,n{fn)) ~ 7^l(Qp-l,n(/n))] 

p=0 

with the convention 7^iQ_i,n = %Qo,n = 7n for p = 0. Notice that 

N 

7p^-l(Qp-l,n(/n)) = Up_.>0 ^p(m(ep-l))(Qp.n) H 

° fc =0 

On the other hand, since we have 

ljVp>0 = ljVp_i>0 + liVp_i>0,Afp>0 - lWp_i>0 = 1/Vp-1>0 - lATp_i>0,JVp=0 

and ljVp=o, 7 srp_i>o Vp = 0, we find that 

p-i 

l^iQpM = H>0<(Qp,n(/n)) 

fc =0 

P-1 

= liVp_.>0<(Qp,n(/n)) 

fc =0 

Recalling that ^p{np_{) = $p(m($p_i)), this readily yields that 
7^^(/n) - 7n(/„) = Ep=o lNp_»o in?;S m($*)(Gfc)l 

X [<«?P.n(/n)) - ^^p«-l)(Qp.n(/n))] 

By construction of the branching model, we have for any test function 
fp € Bb{Ep), and on the event {Np-i > 0} 

, N,-i 

B«(« Up-i,«p-i) = jf^'Ei’U'^pUpXiU) 

from which we conclude that

No E{ri^(fp) I Vi) = Np_i$p(m($p_i))(/p) 

The end of the proof is now clear. ■ 

Corollary 11.8.2 For any $n \ge 0$ and $f_n \in B_b(E_n)$, we have

$$E(\gamma^{N_0}_n(f_n)) = \gamma_n(f_n) \quad\text{and}\quad \sup_{N_0\ge 1} N_0\, E\big([\gamma^{N_0}_n(f_n) - \gamma_n(f_n)]^2\big) < \infty$$







Proof: 

In view of (11.25), we observe that

$$E\big(([N_0\, \eta^{N_0}_k - N_{k-1}\, \Phi_k(\eta^{N_0}_{k-1})](Q_{k,n} f_n))^2 \mid \mathcal F_{k-1}\big) \le c\, N_{k-1}$$

The end of the proof is now a simple consequence of Proposition 11.8.1. ■



11.8.5 Long Time Behavior 

In Corollary 11.8.1, we have presented a pair of regularity conditions on the Feynman-Kac models which ensures uniform $L_2$-estimates with respect to the time parameter. These asymptotic properties are essential in practice to calibrate the initial number of particles needed to achieve a desired precision degree. The main difficulty in the study of the long time behavior of branching models with independent branching numbers is that the total size process

$$N_n = N_0 + \sum_{p=1}^{n} (N_p - N_{p-1}) = N_0 + \sum_{p=1}^{n} \sum_{i=1}^{N_{p-1}} (b_{p-1}^i - N_{p-1} W_{p-1}^i)$$

is a martingale with increasing process

$$\langle N \rangle_n = N_0^2 + \sum_{p=0}^{n-1} \sum_{i=1}^{N_p} E\big((b_p^i - N_p W_p^i)^2 \mid \mathcal F_p\big)$$

The only way to ensure a uniform convergence result is to ensure that

$$\sup_{n} E(\langle N \rangle_n) = N_0^2 + \sum_{p\ge 1} E\big(|N_p - N_{p-1}|^2\big) < \infty$$

Unfortunately, these processes are usually far from being uniformly integrable and we confess that we haven't found a particle model with conditionally independent population size which meets this integrability property. If we consider the Poisson branching numbers, we have

$$E\big((b_p^i - N_p W_p^i)^2 \mid \mathcal F_p\big) = N_p W_p^i \qquad \forall\, 1 \le i \le N_p$$

from which we find that $N_0\, E\big((\eta^{N_0}_n(1) - \eta_n(1))^2\big) = n$ ($\to \infty$ as $n \to \infty$).
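The growth of the error in the Poisson case can be made explicit with a short computation. The sketch below assumes, in line with Section 11.8.2, that the population size is unchanged by mutations (so that $N_{p+1} = \hat N_p$), and it identifies $\eta^{N_0}_n(1)$ with $N_n/N_0$ and $\eta_n(1)$ with 1; these identifications are our reading of the model rather than statements quoted from the text.

```latex
\begin{align*}
E\big((N_{p+1}-N_p)^2 \mid \mathcal F_p\big)
  &= \sum_{i=1}^{N_p} E\big((b_p^i - N_p W_p^i)^2 \mid \mathcal F_p\big)
   = \sum_{i=1}^{N_p} N_p W_p^i = N_p, \\
E\big((N_n-N_0)^2\big)
  &= \sum_{p=0}^{n-1} E\big((N_{p+1}-N_p)^2\big)
   = \sum_{p=0}^{n-1} E(N_p) = n\,N_0, \\
N_0\,E\big((\eta_n^{N_0}(1)-\eta_n(1))^2\big)
  &= N_0\,E\big((N_n/N_0 - 1)^2\big) = n .
\end{align*}
```

The first line uses the conditional independence of the branching numbers, the second the orthogonality of martingale increments together with $E(N_p) = N_0$.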
In the same vein, for the binomial branching model we have

$$E\big((b_p^i - N_p W_p^i)^2 \mid \mathcal F_p\big) = N_p W_p^i\, (1 - W_p^i)$$

If we assume that l/o < Gn{x) < a for some a > 1 and No > o? then one 
gets NpWp{l - Wp) > Np- 0 ?, which again implies that 



n-¥oo 







Although the Bernoulli particle model seems to be the most efficient one (since the independent random variables $\bar b_n^i$ have minimal variance), the forthcoming elementary example shows that, even in this case, one cannot expect to approximate the desired measures uniformly with respect to time. Let us assume that the state space $E_n = \{0, 1\}$ and the pair potential/transitions $(G_n, M_n)$ are homogeneous and chosen so that

$$G(1) = 3\, G(0) > 0 \qquad M(x, dy) = \nu(dy) = \tfrac{1}{2}\,\delta_0(dy) + \tfrac{1}{2}\,\delta_1(dy)$$

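In this case, $\xi_p^i$ are $N_p$ independent random variables with common law $\nu$ and we find that

$$\forall \epsilon > 0 \qquad P\big(|m(\xi_p)(G) - \nu(G)| > \epsilon\, G(0) \mid N_p\big) \le \frac{1}{\epsilon^2 N_p} \qquad (11.35)$$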



Noticing that v{G)/G{0) = 2 = Zv{G)/G{\) and G(0) < G(l), on the set 

r 1 Afp 



'1 



we have that 



and 



G(0) 



-i|< 



<1 



I- 



2 '- 2 ( 2 - e )-2 
G(l) 3. . 3e 



2- - 2(2/3 -e)- 2 

as soon as e € (0, 1/9). This, in turns, implies that 



4i< 



<»s 



G(0) 


— - O ft riH 


G(l) 




— U allU 


iE,'l’iG(4) 



= 1 



and 

4 = 0 => {JVpip;} (i-lAfp^ajii-o' 

4 = 1 =s. {WplV*} (l-{ATpH^})>l(l-9<)= 

It is then clear that on the set fie we have the lower bounds 

Eiib], - NpW;)^ I J^p) = {NpW;} (1 - {ATpW*}) > 1(1 - 9e)2 

This, together with (11.35), shows that 

i=l 

One concludes that E ((r/^ (1) - >7n(l))^) ^ 5^(1 ~ 9^)^ (t oo as n t oo). 







11.8.6 Conditional Branching Models 

In this section, we show that the interacting particle model (with multinomial branching laws) can be obtained by conditioning a Poisson branching particle model to have constant population size. For any $N_0 \ge 1$, we denote

(ft, {Tn, ^n)n>0. (-^n. Cn, Cn)n>0. ) 

the canonical Markov model which realizes the Poisson branching particle 
model starting with Nq particles, and by PJJ® the distribution (on the 
canonical space) of the multinomial branching genetic model discussed in 
(11.24). 

Proposition 11.8.2 For any A € V„(.^n V !Fn), we have 

rf!JA\N = No) = ni{A) PZ-a.s. (11.36) 

Proof: 

Conditionally on the event {N = A/q} = Dn>o{'^'> ~ we have 

(^n,fn)€(E^'”xf;^) Kl-a.S. 

On the other hand, by construction of the mutation transition, we have for 
any n > 0, x, z € E^° 

(C„ € dz\N = ATo,f„_i = x) = (Cn e = x) 

Since changes in the number of particles only take place at branching selec- 
tions, to prove (11.36) it suffices to check that for any n > 0 and x, z 6 E^° 

K (^n G dz\N = ATo,e„ = x) = P{?; G dz|$„ = x) 

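By definition of the Poisson branching model, we have for each $n \ge 0$, $x \in E_n^{N_0}$, and $k = (k_1, \ldots, k_{N_0}) \in \mathbb N^{N_0}$ with $k_1 + \cdots + k_{N_0} = N_0$

$$P^{N_0}(b_n = k \mid N = N_0, \xi_n = x) = P^{N_0}(b_n = k \mid \hat N_n = N_0, \xi_n = x) = \frac{1}{Z(n, N_0)} \prod_{i=1}^{N_0} \exp(-N_0 W_n^i)\, \frac{(N_0 W_n^i)^{k_i}}{k_i!}$$

with $W_n^i = G_n(x^i)/\sum_{j=1}^{N_0} G_n(x^j)$ and the normalizing constants

$$Z(n, N_0) = \sum_{k_1+\cdots+k_{N_0}=N_0}\; \prod_{i=1}^{N_0} \exp(-N_0 W_n^i)\, \frac{(N_0 W_n^i)^{k_i}}{k_i!} = e^{-N_0} N_0^{N_0}/N_0!$$

It is now not difficult to see that

$$P^{N_0}(b_n = k \mid N = N_0, \xi_n = x) = \frac{N_0!}{k_1! \cdots k_{N_0}!} \prod_{i=1}^{N_0} (W_n^i)^{k_i}$$

and therefore $P^{N_0}(b_n = k \mid N = N_0, \xi_n = x) = \bar P^{N_0}(b_n = k \mid \xi_n = x)$, where $\bar P^{N_0}$ stands for the distribution of the multinomial branching model. The end of the proof is now clear. ■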
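The key identity of the proof, namely that independent Poisson branching numbers conditioned on a total of $N_0$ offsprings follow a multinomial law, can be checked exactly on a small example. The sketch below enumerates all configurations; the function names are hypothetical, the weights are assumed to be normalized, and the enumeration is only practical for very small $N_0$.

```python
from itertools import product
from math import exp, factorial, isclose, prod
from typing import Dict, Sequence, Tuple


def poisson_pmf(k: int, lam: float) -> float:
    """P(Poisson(lam) = k)."""
    return exp(-lam) * lam ** k / factorial(k)


def conditioned_poisson_law(
    weights: Sequence[float],
) -> Tuple[Dict[Tuple[int, ...], float], float]:
    """Law of independent Poisson(N0 W^i) numbers given that they sum to N0.

    Returns the conditional law and the normalizing constant Z.
    """
    n0 = len(weights)
    joint = {}
    for ks in product(range(n0 + 1), repeat=n0):
        if sum(ks) == n0:
            joint[ks] = prod(poisson_pmf(k, n0 * w) for k, w in zip(ks, weights))
    z = sum(joint.values())
    return {ks: p / z for ks, p in joint.items()}, z


def multinomial_pmf(ks: Sequence[int], weights: Sequence[float]) -> float:
    """P(Multinomial(sum(ks), weights) = ks)."""
    coef = factorial(sum(ks)) // prod(factorial(k) for k in ks)
    return coef * prod(w ** k for k, w in zip(ks, weights))
```

For instance, with weights (0.5, 0.3, 0.2) the normalizing constant agrees with $e^{-3}\, 3^3/3!$ and every conditional probability matches the multinomial one up to rounding.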

If we use multinomial branching laws, one still has the freedom to adapt the size parameter so as to produce a given number of offsprings. To this end, let $a = (a_n, n \ge 0)$ be the path numbers of offsprings we want to have at each stage of the algorithm (i.e. $N_0 = a_0, N_1 = a_1, \ldots, N_n = a_n, \ldots$). The corresponding branching laws are defined by replacing at each time $n$ the law (11.24) by the multinomial distribution

$$(b_n^1, \ldots, b_n^{a_n}) = \mathrm{Multinomial}(a_{n+1}, W_n^1, \ldots, W_n^{a_n}) \qquad (11.37)$$

We let be the distribution of the particle model with multinomial 

branchings corrections (11.37) (and starting with Nq particles). Arguing as 
above, one proves that 

Proposition 11.8.3 For any A € Vn(J>i V Fn) we have 

Kl(A\N = a) = ^,^<^^(A) F^,l-a.s. 

The continuous time version of Proposition 11.8.2 was proved by Etheridge and March in [134] in their study of the connections between critical branching superprocesses and the Fleming-Viot interacting particle systems. The continuous time version of Proposition 11.8.3 was proved by Perkins [265] in his precise study of the structural properties of Dawson-Watanabe and Fleming-Viot processes.



11.9 Exercises 



Exercise 11.9.1: [Boltzmann-Gibbs and Feynman-Kac models] We recall that the Boltzmann-Gibbs transformation of a given distribution $\nu \in \mathcal P(E)$ with respect to some nonnegative potential function $G$ with $\nu(G) > 0$ is the measure defined by $\Psi(\nu)(dx) = G(x)\, \nu(dx)/\nu(G)$. Suppose that $E$ is a Cartesian product $E = (E_0 \times \ldots \times E_n)$ of some "elementary" measurable spaces and the pair potential/measure $(G, \nu)$ has the form

$$G(x_0, \ldots, x_n) = \prod_{p=0}^{n} G_p(x_p)$$

$$\nu(d(x_0, \ldots, x_n)) = \eta_0(dx_0) \prod_{p=1}^{n} M_p(x_{p-1}, dx_p)$$

for some Markov kernels $M_n$ from $E_{n-1}$ into $E_n$ and a distribution $\eta_0 \in \mathcal P(E_0)$ and for some sequence of potential functions $G_n$ on $E_n$. Check that the Boltzmann-Gibbs distribution $\Psi(\nu)$ coincides with the Feynman-Kac distribution on path space

$$\hat Q_n(d(x_0, \ldots, x_n)) = \frac{1}{\hat Z_n} \Big\{\prod_{p=0}^{n} G_p(x_p)\Big\}\, P_n(d(x_0, \ldots, x_n))$$

where $P_n$ is the distribution of the trajectory $(X_p)_{0\le p\le n}$ of a Markov chain with initial distribution $\eta_0$ and transitions $M_n$.



Exercise 11.9.2: [Sequential Monte Carlo integrations] Let $(E'_n, \mathcal E'_n)_{n\ge 0}$ be a sequence of measurable spaces, and let $\pi_n \in \mathcal M_+(E_n)$ be a sequence of positive and bounded measures on some Cartesian products $E_n = (E'_0 \times \ldots \times E'_n)$ with $\pi_n(1) > 0$. Suppose that we want to evaluate for any $n \ge 0$ and any bounded measurable function $f_n$ on $E_n$ the integrals

$$\pi_n(f_n) = \int_{E'_0\times\ldots\times E'_n} f_n(x_0, \ldots, x_n)\, \pi_n(d(x_0, \ldots, x_n))$$

Further assume that the measures $\pi_n$ can be disintegrated in the sense that

$$\pi_n(d(x_0, \ldots, x_n)) = \pi_{n-1}(d(x_0, \ldots, x_{n-1}))\; \pi_{n-1,n}((x_0, \ldots, x_{n-1}), dx_n) \qquad (11.38)$$

for some collection of measurable transitions $\pi_{n,n+1}$ from $E_n$ into $E'_{n+1}$ with

$$G_n(x_0, \ldots, x_n) = \pi_{n,n+1}(1)(x_0, \ldots, x_n) > 0 \qquad (11.39)$$

Let $X'_n$ be the nonanticipative sequence of $E'_n$-valued random variables with initial distribution $\eta_0(dx_0) = \pi_0(dx_0)/\pi_0(1)$ and "elementary transitions"

$$P(X'_n \in dx_n \mid X'_0 = x_0, \ldots, X'_{n-1} = x_{n-1}) = \frac{\pi_{n-1,n}((x_0, \ldots, x_{n-1}), dx_n)}{\pi_{n-1,n}(1)(x_0, \ldots, x_{n-1})}$$

Show that the random path sequence defined by $X_n = (X'_0, \ldots, X'_n)$ is an $E_n$-valued Markov chain and we have the Feynman-Kac representation formula

$$\pi_n(f_n) = \pi_0(1)\; E\Big(f_n(X_n) \prod_{p=0}^{n-1} G_p(X_p)\Big)$$



Exercise 11.9.3: [Restricted Markov chain models] Let $Y'_n$ be an $E'_n$-valued Markov chain with initial distribution $\pi_0$ and elementary Markov transitions $P'_n$. Also let $A_n \in \mathcal E'_n$ be a given collection of measurable sets such that $\pi_0(A_0) > 0$ and $P'_n(x_{n-1}, A_n) > 0$ for any $x_{n-1} \in A_{n-1}$. We denote by $\pi_n$ the distribution of the random paths restricted to the tube $(\prod_{p=0}^{n} A_p)$. More formally, $\pi_n$ is defined for any $f_n \in B_b(\prod_{p=0}^{n} E'_p)$ by the formula

$$\pi_n(f_n) = E\big(f_n(Y'_0, \ldots, Y'_n)\, 1_{A_0\times\ldots\times A_n}(Y'_0, \ldots, Y'_n)\big) \qquad (11.40)$$

We denote by $X'_n$ the Markov chain from $A_{n-1}$ into $A_n$ with initial distribution $\eta_0(dx_0) = \pi_0(dx_0)\, 1_{A_0}(x_0)/\pi_0(A_0)$ and elementary transitions given for any $x_{n-1} \in A_{n-1}$ by

$$M'_n(x_{n-1}, dx_n) = \frac{P'_n(x_{n-1}, dx_n)\, 1_{A_n}(x_n)}{P'_n(x_{n-1}, A_n)}$$

• Show that $\pi_n$ can be rewritten in the Feynman-Kac form

$$\pi_n(f_n) = \pi_0(A_0)\; E\Big(f_n(X'_0, \ldots, X'_n) \prod_{p=1}^{n} P'_p(X'_{p-1}, A_p)\Big)$$

• Prove that the multiplicative property (11.38) holds true on the sets $(\prod_{p=0}^{n} A_p)$ and

$$\pi_{n-1,n}((x_0, \ldots, x_{n-1}), dx_n) = P'_n(x_{n-1}, A_n)\, M'_n(x_{n-1}, dx_n)$$

Check that the corresponding particle simulation models can be interpreted as an interacting acceptance/rejection technique.



Exercise 11.9.4: [Maxima distribution functions] Let $Y'_n$ be a nonanticipative sequence of random variables taking values in some measurable spaces $(E'_n, \mathcal E'_n)$. Also let $V_n$ be a sequence of measurable functions on $E'_n$. If we take in (11.40) the sets $A_n = V_n^{-1}((-\infty, l])$ for some $l \in \mathbb R$, then prove that $\pi_n(1) = P(\sup_{p\le n} V_p(Y'_p) \le l)$ and

$$\pi_n = \mathrm{Law}\Big(Y'_0, \ldots, Y'_n \;;\; \sup_{p\le n} V_p(Y'_p) \le l\Big)$$

Design a particle approximation model of these quantities.



Exercise 11.9.5: [Hitting time probabilities] We consider time-homogeneous
state spaces $E'_n = E$ and a given measurable subset
$A_n = A \in \mathcal{E}$. Check that the prediction Feynman-Kac measure
$\eta_n$ introduced in Exercise 11.9.3 coincides with the law of a Markov
path $(Y'_0, \ldots, Y'_n)$ given the fact that it has never exited the set
$A$ after $n$ steps; that is, for any $f_n \in B_b(E^{n+1})$

$$\eta_n(f_n) = \pi_n(f_n)/\pi_n(1) = E\big( f_n(Y'_0, \ldots, Y'_n) \mid T > n \big)$$

$$\pi_n(1) = P(T > n) = \prod_{p=0}^{n} \eta'_p(A)$$

where $\eta'_n$ stands for the $n$th time marginal of $\eta_n$ and $T$ the
first time $Y'_n$ enters into $(E - A)$. The estimation of first passage
probabilities arises in various engineering problems, including
catastrophic failures, buffer exceedance overflows, and financial ruin
processes. Check that

$$\eta'_n(A) = P(T > n \mid T \ge n) \quad \text{and} \quad P(T > n) = \prod_{p=0}^{n} P(T > p \mid T \ge p)$$

Finally, prove that $P(T = n) = \pi_n(1) - \pi_{n+1}(1) = \prod_{p=0} \ldots$

and construct a particle approximation model of these quantities.



Exercise 11.9.6: Let $E_n = \mathbb{R}^d$, and let $(G_n, M_n)$ be defined by
the formulae

$$G_n(x_n) = g_n(y_n, x_n) \stackrel{\mathrm{def.}}{=} \exp\Big\{ -\tfrac{1}{2} (y_n - c_n x_n)\, r_n^{-1}\, (y_n - c_n x_n)^T \Big\}$$

and

$$M_n(x_{n-1}, dx_n) \propto \exp\Big\{ -\tfrac{1}{2} (x_n - a_n(x_{n-1}))\, q_n^{-1}\, (x_n - a_n(x_{n-1}))^T \Big\}\, dx_n$$

where $z^T$ represents the transpose of a column vector $z$,
$y_n \in \mathbb{R}^{d'}$ is a given $d'$-dimensional vector, $q_n, r_n$ are
$(d \times d)$ and $(d' \times d')$ symmetric and nonnegative matrices, $c_n$
is a $(d' \times d)$-matrix, and finally $a_n$ is a bounded drift function on
$\mathbb{R}^d$. In this example, we see that regions with high potential
correspond to regions where $c_n x_n$ is close to the fixed and given
parameter $y_n$.

1. Check that for any fixed vector $x_{n-1}$ the function

$$(x_n, y_n) \in \mathbb{R}^d \times \mathbb{R}^{d'} \mapsto g_n(y_n, x_n)\, \frac{M_n(x_{n-1}, dx_n)}{dx_n}$$

is, up to a normalizing constant, the joint density of the $(d + d')$
Gaussian vector

$$(X_n, Y_n) = (a_n(x_{n-1}) + W_n,\; c_n X_n + V_n)$$

where $(W_n, V_n) \in \mathbb{R}^{d + d'}$ is a pair of independent centered
Gaussian vectors with covariance matrices $E(W_n W_n^T) = q_n$ and
$E(V_n V_n^T) = r_n$.

2. Show that the conditional Markov transition
$\widehat{M}_n(x_{n-1}, \cdot)$ introduced in (11.16) coincides in this
situation with the Gaussian distribution on $\mathbb{R}^d$ with mean
$m_n(x_{n-1})$ and covariance matrix $s_n(x_{n-1})$ defined by

$$m_n(x_{n-1}) = a_n(x_{n-1}) + q_n c_n^T (c_n q_n c_n^T + r_n)^{-1} (y_n - c_n a_n(x_{n-1}))$$

$$s_n(x_{n-1}) = q_n - q_n c_n^T (c_n q_n c_n^T + r_n)^{-1} c_n q_n$$

3. Also prove that the corresponding potential functions
$\widehat{G}_n = M_n(G_n)$ have the form

$$\widehat{G}_n(x_{n-1}) \propto \exp\Big\{ -\tfrac{1}{2} (y_n - c_n a_n(x_{n-1}))\, (c_n q_n c_n^T + r_n)^{-1}\, (y_n - c_n a_n(x_{n-1}))^T \Big\}$$

Describe the mutation/selection transitions of the particle model
associated with the pair $(\widehat{G}_n, \widehat{M}_n)$.
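
As a numerical companion to items 2 and 3, the following is a minimal sketch of the scalar case d = d' = 1, where the matrices q_n, r_n, c_n reduce to numbers. The function names `conditional_gaussian_1d` and `particle_step` are hypothetical, and the selection-then-mutation ordering is one possible reading of the particle model, not necessarily the one intended by the exercise.

```python
import math
import random

def conditional_gaussian_1d(a_prev, q, r, c, y):
    """Scalar version of the formulas for m_n, s_n and the conditional potential.

    a_prev is the drift value a_n(x_{n-1}); q and r are variances (r > 0);
    c is the observation coefficient; y is the fixed observation y_n.
    """
    innovation = y - c * a_prev
    s_obs = c * q * c + r                 # c q c^T + r
    gain = q * c / s_obs                  # q c^T (c q c^T + r)^{-1}
    mean = a_prev + gain * innovation     # m_n(x_{n-1})
    var = q - gain * c * q                # s_n(x_{n-1})
    log_potential = -0.5 * innovation ** 2 / s_obs  # log of unnormalized potential
    return mean, var, log_potential

def particle_step(particles, drift, q, r, c, y, rng):
    """One selection/mutation step (illustrative assumption): parents are
    selected with weights given by the conditional potential, then moved with
    the conditional Gaussian transition."""
    stats = [conditional_gaussian_1d(drift(x), q, r, c, y) for x in particles]
    weights = [math.exp(lp) for _, _, lp in stats]
    chosen = rng.choices(range(len(particles)), weights=weights, k=len(particles))
    return [rng.gauss(stats[i][0], math.sqrt(stats[i][1])) for i in chosen]
```

With a_prev = 0, q = r = c = 1 and y = 2, the formulas give mean 1, variance 1/2 and log-potential -1, which is easy to confirm by hand.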



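Exercise 11.9.7: [Baker's selection [22]] We consider Baker's remainder
stochastic scheme introduced in (11.28). Check that
$E(\overline{M}^i \mid X) = \{N W^i\}$, and conclude that for any
$f \in B_b(E)$ we have the unbiased property
$E(\widehat{m}(X)(f) \mid X) = \Psi(m(X))(f)$. Recalling that
$W^i = G(X^i)/\sum_{j=1}^{N} G(X^j)$, show that

$$\sum_{i=1}^{N} \frac{\{N W^i\}}{\sum_{j=1}^{N} \{N W^j\}}\, \delta_{X^i} = \overline{\Psi}(m(X))$$

with the Boltzmann-Gibbs transformation $\overline{\Psi}$ defined by the
formula

$$\forall (\eta, f) \in \mathcal{P}(E) \times B_b(E) \qquad \overline{\Psi}(\eta)(f) = \eta(G_\eta f)/\eta(G_\eta)$$

with the potential function $G_\eta(x) = \{G(x)/\eta(G)\}$. From the question
above, deduce the equivalence in distribution

$$N \big( \widehat{m}(X) - \Psi(m(X)) \big) \stackrel{\mathrm{law}}{=} \overline{N} \big( \overline{m}(\overline{X}) - \overline{\Psi}(m(X)) \big)$$

where $\overline{m}(\overline{X}) = \frac{1}{\overline{N}} \sum_{i=1}^{\overline{N}} \delta_{\overline{X}^i}$
represents the empirical measure associated with a sequence of
$\overline{N}$ independent random variables with common law
$\overline{\Psi}(m(X))$. Using Lemma 7.3.3, prove that for any $p \ge 1$ and
$\mathrm{osc}(f) \le 1$ we have

$$\sqrt{N}\, E\big( |(\widehat{m}(X) - \Psi(m(X)))(f)|^p \mid X \big)^{1/p} \le d(p)^{1/p}$$

with the sequence of constants $d(p)$ defined in (7.7).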
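
The remainder scheme discussed in this exercise can be sketched as follows, assuming the usual reading in which each particle first receives the integer part of N W^i offspring and the remaining slots are filled by multinomial sampling with probabilities proportional to the fractional parts {N W^i}. The function name `remainder_resample` is hypothetical, and the details of (11.28) lie outside this excerpt.

```python
import math
import random

def remainder_resample(weights, rng):
    """Return offspring counts (one per particle) that sum to N = len(weights)."""
    n = len(weights)
    total = sum(weights)
    scaled = [n * w / total for w in weights]             # N W^i
    counts = [math.floor(s) for s in scaled]              # integer parts
    fractions = [s - k for s, k in zip(scaled, counts)]   # fractional parts {N W^i}
    remaining = n - sum(counts)                           # slots left to fill
    if remaining > 0:
        # multinomial draw with probabilities proportional to the fractional parts
        for i in rng.choices(range(n), weights=fractions, k=remaining):
            counts[i] += 1
    return counts
```

Since the expected number of extra offspring of particle i is {N W^i}, the expected total offspring number is N W^i, which is the unbiasedness property stated above.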



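Exercise 11.9.8: Let $(e_i)_{i \ge 1}$ be a sequence of independent random
variables with common exponential distribution

$$P(e_i \in dx) = \lambda\, e^{-\lambda x}\, 1_{\mathbb{R}_+}(x)\, dx$$

We also set $T_i = \theta^i(e_1, \ldots, e_N) \stackrel{\mathrm{def.}}{=} \sum_{k=1}^{i} e_k$
for each $i \le N$. Check that $\theta = (\theta^i)_{i \le N}$ is a
diffeomorphism between $\mathbb{R}_+^N$ and the set

$$C_N = \{ (t_1, \ldots, t_N) : 0 < t_1 < \ldots < t_N \}$$

Prove that

$$(\theta^{-1})^i(t_1, \ldots, t_N) = t_i - t_{i-1} \ (\text{with } t_0 = 0) \quad \text{and} \quad \mathrm{Jac}(\theta^{-1}) = 1$$

Conclude that

$$P((T_1, \ldots, T_N) \in d(t_1, \ldots, t_N)) = 1_{C_N}(t_1, \ldots, t_N)\, \lambda^N e^{-\lambda t_N}\, dt_1 \ldots dt_N$$

If we set $T_{N+1} = T_N + e_{N+1}$, then check that

$$P(T_{N+1} \in dt) = \lambda^{N+1} e^{-\lambda t}\, 1_{\mathbb{R}_+}(t) \Big[ \int_{0 < t_1 < \ldots < t_N < t} dt_1 \ldots dt_N \Big]\, dt = \lambda e^{-\lambda t}\, \frac{(\lambda t)^N}{N!}\, 1_{\mathbb{R}_+}(t)\, dt$$

Using the decomposition

$$\lambda^{N+1} e^{-\lambda t_{N+1}}\, 1_{0 < t_1 < \ldots < t_N < t_{N+1}}\, dt_1 \ldots dt_{N+1} = \Big( N!\, 1_{0 < t_1 < \ldots < t_N < t_{N+1}}\, \frac{1}{t_{N+1}^N}\, dt_1 \ldots dt_N \Big) \times \Big( \lambda e^{-\lambda t_{N+1}}\, \frac{(\lambda t_{N+1})^N}{N!}\, dt_{N+1} \Big)$$

prove that

$$P((T_1/T_{N+1}, \ldots, T_N/T_{N+1}, T_{N+1}) \in d(u_1, \ldots, u_N, t_{N+1})) = P((T_1/T_{N+1}, \ldots, T_N/T_{N+1}) \in d(u_1, \ldots, u_N)) \times P(T_{N+1} \in dt_{N+1})$$

with

$$P((T_1/T_{N+1}, \ldots, T_N/T_{N+1}) \in d(u_1, \ldots, u_N)) = N!\, 1_{0 < u_1 < \ldots < u_N < 1}\, du_1 \ldots du_N$$

Conclude that $(T_1/T_{N+1}, \ldots, T_N/T_{N+1})$ is a uniform order
statistic. Describe a simulation algorithm to sample the three multinomial
type branching models described respectively in (11.22), (11.24) and
(11.28).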
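
The conclusion of this exercise suggests a sorting-free way to draw multinomial offspring numbers: generate the uniform order statistic from normalized sums of exponentials, then sweep once through the cumulative weights. The sketch below illustrates that idea under the assumption of a plain multinomial scheme with probabilities proportional to `weights`; the function names are hypothetical, and the specific models (11.22), (11.24) and (11.28) are not reproduced here.

```python
import random

def uniform_order_statistics(n, rng):
    """Increasing sample of n uniforms on (0, 1), built as T_i / T_{n+1}."""
    partial, total = [], 0.0
    for _ in range(n + 1):
        total += rng.expovariate(1.0)     # exponential increments e_k
        partial.append(total)             # partial sums T_1, ..., T_{n+1}
    return [t / partial[n] for t in partial[:n]]

def multinomial_counts(weights, rng):
    """Multinomial offspring counts obtained by one pass over the sorted uniforms."""
    n = len(weights)
    total = sum(weights)
    counts = [0] * n
    i, cum = 0, weights[0] / total
    for u in uniform_order_statistics(n, rng):
        # advance to the first index whose cumulative weight covers u
        while u > cum and i < n - 1:
            i += 1
            cum += weights[i] / total
        counts[i] += 1
    return counts
```

Because the uniforms arrive already sorted, the sweep costs a number of operations linear in N instead of requiring a sort or a search per draw.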



Exercise 11.9.9: Let $U$ be a uniform $[0, 1]$-valued random variable, and
let $\lambda > 0$ be a given parameter. Show that the random variable
$e = \lambda^{-1} \log(1/U)$ is an exponential random variable with
parameter $\lambda$. We let $(e_i)_{i \ge 1}$ be a sequence of independent
exponential random variables with parameter $\lambda = 1$, and we set
$T_i = \sum_{k=1}^{i} e_k$. Using Exercise 11.9.8, prove that for any
$w > 0$ we have

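$$P(T_m \le w < T_{m+1}) = \int_0^\infty \int_0^\infty 1_{t \le w < t + s}\; e^{-s}\, e^{-t}\, \frac{t^{m-1}}{(m-1)!}\; ds\, dt$$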



Conclude that

$$P(T_m \le w < T_{m+1}) = e^{-w}\, \frac{w^m}{m!}$$

We let $b$ be the first time $m \ge 0$ we have $T_{m+1} > w$. Prove that

$$P(b = m) = e^{-w}\, \frac{w^m}{m!}$$

Describe a simulation algorithm to sample the Poisson branching model
described in (11.31).
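
A minimal sketch of the sampler this exercise leads to: exponential variables are produced by inversion, with the parameter read as the rate of the density of Exercise 11.9.8 (so the logarithm is divided by the rate), and a Poisson variable with mean w is obtained by counting how many partial sums stay below w. The function names are hypothetical, and the branching model (11.31) itself is not reproduced here.

```python
import math
import random

def exponential(rate, rng):
    """Inversion sampling; 1 - U is uniform on (0, 1], so the log is finite."""
    return math.log(1.0 / (1.0 - rng.random())) / rate

def poisson(w, rng):
    """First m >= 0 such that T_{m+1} > w, with T_i sums of unit-rate exponentials."""
    m, total = 0, exponential(1.0, rng)   # total holds T_{m+1}
    while total <= w:
        m += 1
        total += exponential(1.0, rng)
    return m
```

The running time is proportional to the sampled value, so this construction is mainly suited to moderate values of w.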




12 

Applications 



12.1 Introduction 

This rather long chapter focuses on the applications of Feynman-Kac
modeling strategies and their interacting particle interpretations to a
variety of practical problems. The field of applications includes spectral
analysis of Feynman-Kac and Schrödinger semigroups, rare event estimation,
sequential analysis of probability ratio tests, Dirichlet problems with
boundary conditions, directed polymer simulations, and nonlinear filtering
problems. As an initiated reader will immediately notice, all these
problems consist of solving a more or less complex Feynman-Kac
distribution. At the risk of repetition, we have chosen to include this
chapter because we felt that there is no textbook or journal article that
really illustrates the potential applications of Feynman-Kac and particle
models. Conversely, a reader not yet familiar with Feynman-Kac and particle
models is advised to read Chapter 11 before starting this chapter. Chapter
11 leaves out theoretical issues and guides the reader through most of the
important concepts and techniques needed in applications. For a more
thorough training on Feynman-Kac and particle models, it is convenient to
read Chapters 2 and 3.

We do not pretend to present in each particular application the most
efficient algorithm with the optimal branching selection distribution or the
best choice of exploration excursions. The approach we have taken here
rather emphasizes the Feynman-Kac modeling of a given estimation problem.
As soon as we have developed a sufficiently generic Feynman-Kac
interpretation, we roughly design a rather general but basic particle
approximation model. In general, we leave aside the possible improvements
we could obtain by using one or another of the recipes presented in
Chapter 11.

It is of course out of the scope of this chapter to provide a catalog with 
detailed numerical comparisons between these interacting particle approx- 
imation models and some other more traditional techniques such as the 
extended Kalman-Bucy filter often used in “almost” linear/Gaussian filter- 
ing problems or any other alternative estimation models. To offer a way 
of comparison and to better connect the particle methodology with more 
classical literature on each application area, sometimes we describe partic- 
ular situations where explicit calculations of the desired quantities can be 
derived. These examples can also serve practitioners for testing numerically 
the accuracy of particle models. The proof of these explicit and analytical 
solutions is often housed in a series of exercises at the end of each section. 

To avoid repetition, we will not restate in each particular application
area all the convergence results we can deduce from Chapters 7 to 10 on the
asymptotic analysis of particle models. Sometimes we illustrate the impact
of some asymptotic theorem in a specific application. But as a general rule
we prefer to give some precise reference to a specific convergence theorem.

From the applied probability viewpoint, the present chapter is certainly
one of the most important chapters of the book. The interested reader 
can try to develop a collection of particle approximation models in each 
application subject and can also find and interpret a selected asymptotic 
convergence theorem. For a more thorough training on practical estima- 
tion problems, we provide a brief catalog on selected journal articles in 
the applied literature. For applications of particle methods to tracking 
and visual detection of objects, we recommend to the reader the chain 
of articles [17, 148, 164, 212, 282, 143]. Applications of particle meth- 
ods to global positioning systems can be found in [16, 43]. The multi- 
splitting particle analysis of rare events is described in the chain of ar- 
ticles [11, 157, 158, 159, 303, 304, 305]. We also mention applications in 
image analysis [187] as well as in biology with gene estimations in DNA 
sequences [227, 228, 229, 230, 231, 233], data assimilation and inverse prob- 
lems for ocean monitoring and prediction [136, 137, 138, 139, 218], and in 
finance with economic time series [270]. See also the multiauthor book [125] 
as well as the monograph [227] and references therein. 

In each application area, the same Feynman-Kac and particle models are
often expressed using different language. To guide the reader and to better
connect our mathematical models with the more "applied" literature on
this subject, we provide hereafter a short discussion on these different
terminologies. Conditional distributions and filtering problems are among
the most typical examples of Feynman-Kac models arising in various
scientific disciplines. In weather and oceanography literature, this
estimation problem is instead called data assimilation with reference to
the huge amount of observations provided by atmospheric and/or
oceanographic measurements. Here the updating and prediction transitions
are respectively called the model analysis and the model forecast (see for
instance [139] and references therein). In this context the particle
methodology is instead used to estimate the error covariance matrices in an
extended Kalman filter. Sometimes the empirical measures are called
"ensembles", and the resulting particle approximation models are
simplified into the so-called "ensemble Kalman filters".

In Bayesian literature, the filtering model is preferably expressed in terms
of a Bayes formula relating an “a priori” model with the desired “a poste- 
riori” distribution. The latter measures are sometimes called the “beliefs” 
(see for instance [212]). In this branch of applied statistics, particle approx- 
imation models have taken various names such as “sampling-importance- 
resampling filters,” “condensation filters,” or “bootstrap filters,” but it 
seems that the natural terminology “particle filters” is nowadays adopted. 
We hope that these modern particle methodologies will continue to serve 
as a bridge between the frequentist and Bayesian viewpoints. 

In Monte Carlo Markov chain methods, Feynman-Kac particle mod- 
els are also called “sequential Monte Carlo methods” (often abbreviated 
SMCM) to emphasize probably the nonrecursive drawback of traditional 
Monte Carlo Markov chain methods. In this context, the abstract predic- 
tion Feynman-Kac models on path space are sometimes expressed using 
recursive abstract formulae that basically read 

$$\mathbb{Q}_n(d(x_0, \ldots, x_n)) \propto \mathbb{Q}_{n-1}(d(x_0, \ldots, x_{n-1}))\; Q_n(x_{n-1}, dx_n)$$

for some positive kernel $Q_n$ from $E_{n-1}$ into $E_n$. Whenever the
normalizing constants $Q_n(1) > 0$, if we take $G_{n-1} = Q_n(1)$ and
$M_n = Q_n/Q_n(1)$, these path measures coincide with the one introduced in
(1.3) on page 11 (see also Exercise 11.9.2).

Boltzmann-Gibbs or Feynman-Kac formulae also arise in statistical physics,
biology, and financial mathematics, and more generally in applied
probability, but in these areas the terminology is rather more stable and
often coincides with that adopted in this book.



12.2 Random Excursion Models 

12.2.1 Introduction 

This section focuses on Feynman-Kac distributions on excursion spaces. 
We first design a multilevel modeling technique that reduces the analysis 
of these rather complex functionals to “simple” discrete time Feynman-Kac 
models. Then we apply the particle methodology to solve these formulae 
numerically. To motivate this section, we illustrate the impact of these 
modeling techniques with a brief discussion on some different application
areas. 







In engineering science, these models can be used to represent the law 
of a random process in some rare event regime. These rare events may 
represent a catastrophic failure or a buffer exceedance. For instance, in 
modern communication networks, several packets of information are sent
from a source to a target destination. During their transmission, they visit
several nodes in the network. At each node, they wait until the service
capacity of the buffer is sufficiently high, otherwise the packet is lost. In 
practice, the buffers are sufficiently large and these events are hopefully 
rare events. To study the performance of these networks, one is not only 
interested in estimating the probability of overflows but also how these 
events happen. In this particular situation, the corresponding Feynman- 
Kac path model represents the law of the queueing process in this rare 
event regime. 

In physics, excursion models may represent the distribution of a path
particle in an absorbing medium with hard and soft obstacles (see
Section 12.2.5). For a more practical illustration, we can think of a
radiation source model that emits neutron particles in a containment (see
for instance [146]). In this context, the absorption potential depends on
the nature of the shielding environment. The choice of the hard obstacle
sets depends on the problem at hand. For instance, if we are interested in
computing the probability that a particle escapes the containment before
disintegrating in some particular region of the configuration space, then
the hard obstacle set will be chosen as this portion of the configuration
space. We again refer the interested reader to Section 12.2.5 for a more
thorough study of particle evolution models in an absorbing medium with
only hard obstacles.

Feynman-Kac excursion models also provide a natural probabilistic
interpretation of the solution of Dirichlet problems with boundary
conditions. This subject, which lies at the intersection of partial
differential equations, linear operators, and probability theory, also
arises in a variety of engineering applications. For instance, in financial
mathematics, Feynman-Kac distributions are often used to model option price
evolutions. In this context, the hard obstacle sets usually represent some
levels at which the option becomes worthless, while the potential function
is interpreted as an instantaneous interest rate (see for instance the
pedagogical textbook of Lamberton and Lapeyre [213]). In applied
probability, these models are also used to analyze the possible limiting
behaviors of a given Markov process (see [279]) or to capture the interplay
between the geometry of the domain and the behavior of a stochastic process
as it approaches the boundaries (see for instance [268, 295]).

The main idea behind the forthcoming excursion modeling techniques is
to decompose the state space into a judicious choice of threshold subsets
related to the system evolution. This decomposition reflects the successive
levels the stochastic process needs to cross before entering into the
relevant rare event. A rough description of the splitting particle method is
as follows. When a particle starting at some level does not succeed in
entering into the next one, it is killed, but each time it enters into a
level closer to the rare set, it splits into several offspring. Between the
levels, these offspring evolve as independent copies of the stochastic
process of interest until they reach (or fail to reach) an even closer
level, and so on. Loosely speaking, the branching particles are attracted by
gateway regions from which the rare event is more likely to happen. In this
sense, these excursion splitting techniques make the occurrence of rare
events more frequent. Thus, they can also be regarded as an alternative to
traditional importance-sampling methods.
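
The rough description above can be illustrated on a toy example. The sketch below assumes a nearest-neighbour random walk on the integers with upward probability `p_up`, killed at 0, and a fixed number of offspring per successful crossing; the function names and the choice of levels are hypothetical and serve only as an illustration of fixed-factor splitting.

```python
import random

def run_to_next_level(x, target, p_up, rng):
    """Run the walk from x until it hits `target` (success) or 0 (killed)."""
    while 0 < x < target:
        x += 1 if rng.random() < p_up else -1
    return x == target

def splitting_estimate(start, levels, p_up, n_start, offspring, rng):
    """Estimate P(walk reaches levels[-1] before 0) by fixed-factor splitting."""
    population, position = n_start, start
    for j, level in enumerate(levels):
        # every particle of the current generation starts from the same level
        survivors = sum(run_to_next_level(position, level, p_up, rng)
                        for _ in range(population))
        if survivors == 0:
            return 0.0
        position = level
        is_last = (j == len(levels) - 1)
        population = survivors if is_last else survivors * offspring
    # each success at the final level carries weight 1 / offspring^(levels - 1)
    return population / (n_start * offspring ** (len(levels) - 1))
```

For this walk the target probability is known in closed form (gambler's ruin), so the estimate can be compared with (1 - r^start)/(1 - r^L), where r = (1 - p_up)/p_up and L is the last level.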

These branching evolutionary algorithms were originally discussed in
physics by Kahn and Harris to estimate particle transmission events [177].
Since that time, several variations and refinements have been suggested by
analysts and designers in telecommunication and computer systems. The
most widely used nowadays is the RESTART algorithm introduced in
the pioneering articles of Villén-Altamirano et al. [303, 304, 305].
These models were further developed in a series of three articles of
Glasserman et al. in [157, 158, 159]; we also recommend [298] for
applications to communication networks as well as the article of Garvels
and Kroese [153] for some details on the computer implementation of these
algorithms. Most of the algorithms presented in this literature (except
[157, 158, 159], which are based on judicious Bernoulli simplified models,
large deviations, and fluctuation analysis) are essentially based on
heuristic schemes with no really precise mathematical analysis. Moreover,
it is commonly assumed that the transition probabilities between the
levels, and thus the desired rare event probability, are known.

The objective of this section is to design a novel adaptive particle splitting 
method to estimate these rare events. The central idea is to represent the 
distribution of the process in the rare event regime in terms of a class of 
Feynman-Kac measures on the space of excursions.

12.2.2 Dirichlet Problems with Boundary Conditions

In this section, we design a strategy to estimate a given Feynman-Kac
measure on excursion space by a Feynman-Kac distribution flow. We illustrate
the impact of this modeling technique by an original particle interpretation
of Dirichlet problems with boundary conditions. In Section 12.2.4 we shall
examine related Dirichlet models with boundary hard obstacles.

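We let $X'_n$ be a Markov chain taking values in some measurable spaces
$(E'_n, \mathcal{E}'_n)$. We let $T$ be a finite stopping time w.r.t. the
filtration generated by $X'_n$. We associate with a sequence of
$[0, 1]$-valued potential functions $G'_n$ on $E'_{[0,n]}$ the Feynman-Kac
distribution

$$\gamma(f) = E\Big( f(T, X'_{[0,T]}) \prod_{p=0}^{T} G'_p(X'_{[0,p]}) \Big) \qquad (12.1)$$

where $f$ is a bounded measurable test function on the excursion space

$$E = \cup_{n \ge 0} (\{n\} \times E'_{[0,n]})$$

and we have used the notation
$X'_{[0,n]} \stackrel{\mathrm{def.}}{=} (X'_p)_{0 \le p \le n}$ for every
$n \ge 0$. Now, we consider the $E$-valued and stopped Markov chain

$$X_n = (T \wedge n, X'_{[0, T \wedge n]}) \in E_n \stackrel{\mathrm{def.}}{=} \cup_{0 \le p \le n} (\{p\} \times E'_{[0,p]})$$

and the potential function $G_n$ on $E_n$ defined by

$$G_n(X_n) = \begin{cases} G'_n(X'_{[0,n]}) & \text{if } n \le T \\ 1 & \text{if } n > T \end{cases} \qquad (12.2)$$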

For instance, let $X'$ be the simple random walk defined in Example 2.2.1.
Suppose $X'_0 \in (0, \infty)$ and let $T$ be the first time $X'$ hits 0. In
this case, the stopped process $X'_{T \wedge n}$ coincides with the random
walk on $\mathbb{N}$ where the origin is an absorbing barrier.

We associate with the stopped Markov chain $X_n$ and the potential $G_n$
the Feynman-Kac distributions on $E_n$ defined by

$$\gamma_n(f) = E\Big( f(X_n) \prod_{p=0}^{n} G_p(X_p) \Big)$$

For more details on stopped Markov processes, we refer the reader to
Section 2.2.3. The next proposition allows us to interpret the Feynman-Kac
measures on excursion spaces (12.1) as the limiting measures of
Feynman-Kac semigroups.

Proposition 12.2.1 For any $n \ge 0$ and $f \in B_b(E)$, with
$\|f\| \le 1$, we have

$$|\gamma_n(f) - \gamma(f)| \le 2 P(T > n) \quad (\to 0 \text{ as } n \to \infty)$$

and

$$0 \le \gamma_n(1) - \gamma(1) \le P(T > n)$$

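Proof:

We first observe that

$$\gamma_n(f) = E\Big( f(X_n)\, 1_{T \le n} \prod_{p=0}^{n} G_p(X_p) \Big) + E\Big( f(X_n)\, 1_{T > n} \prod_{p=0}^{n} G_p(X_p) \Big)$$

and

$$E\Big( f(X_n)\, 1_{T \le n} \prod_{p=0}^{n} G_p(X_p) \Big) = E\Big( f(T, X'_{[0,T]})\, 1_{T \le n} \prod_{p=0}^{T} G'_p(X'_{[0,p]}) \Big) = \gamma(f) - E\Big( f(T, X'_{[0,T]})\, 1_{T > n} \prod_{p=0}^{T} G'_p(X'_{[0,p]}) \Big)$$

Therefore, we find that

$$\gamma_n(f) - \gamma(f) = E\Big( \Big[ f(X_n) \prod_{p=0}^{n} G'_p(X'_{[0,p]}) - f(T, X'_{[0,T]}) \prod_{p=0}^{T} G'_p(X'_{[0,p]}) \Big]\, 1_{T > n} \Big)$$

from which the end of the proof is clear. ■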
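
The second estimate of Proposition 12.2.1 can be observed numerically. The sketch below is an illustration under simplifying assumptions that are not in the text: X' is a simple random walk on the integers, T is its exit time from {1, ..., width - 1}, and the potential is a constant g in (0, 1]. With these choices the product of potentials up to time n equals g to the power min(n, T) + 1. Because the three quantities are estimated from the same simulated paths, the inequalities hold path by path and therefore for the empirical averages as well.

```python
import random

def excursion_estimates(n, num_paths=2000, start=3, width=6, g=0.9, seed=0):
    """Monte Carlo estimates of gamma_n(1), gamma(1) and P(T > n) on shared paths."""
    rng = random.Random(seed)
    gamma_n = gamma = tail = 0.0
    for _ in range(num_paths):
        x, t = start, 0
        while 0 < x < width:              # run until the exit time T
            x += rng.choice((-1, 1))
            t += 1
        gamma += g ** (t + 1)             # product of g over p = 0..T
        gamma_n += g ** (min(n, t) + 1)   # potentials equal 1 after time T
        tail += (t > n)
    return gamma_n / num_paths, gamma / num_paths, tail / num_paths
```

For example, `excursion_estimates(5)` returns a triple whose first two entries differ by a nonnegative amount bounded by the third.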

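If $\gamma(1) > 0$, then we can define the normalized distributions

$$\eta_n(f) = \gamma_n(f)/\gamma_n(1) \quad \text{and} \quad \eta(f) = \gamma(f)/\gamma(1)$$

and by Proposition 12.2.1, we have

$$\sup_{f : \mathrm{osc}(f) \le 1} |\eta_n(f) - \eta(f)| \le 2 P(T > n)/\gamma(1) \quad (\to 0 \text{ as } n \to \infty)$$

To prove this assertion, we simply use the decomposition

$$\eta_n(f) - \eta(f) = \frac{1}{\gamma(1)} \Big[ \gamma_n\big( f - \eta_n(f) \big) - \gamma\big( f - \eta_n(f) \big) \Big]$$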

As an aside, we note that

$$Q_{n+1}(f)(t_n, (x_0, \ldots, x_{t_n})) \stackrel{\mathrm{def.}}{=} M_{n+1}(G_{n+1} f)(t_n, (x_0, \ldots, x_{t_n})) = f(t_n, (x_0, \ldots, x_{t_n}))\, G_{n+1}(t_n, (x_0, \ldots, x_{t_n})) = f(t_n, (x_0, \ldots, x_{t_n}))$$

as soon as

$$(t_n < n) \quad \text{or} \quad (t_n = n \text{ and } (x_0, \ldots, x_n) \notin A_{n+1})$$

where $M_n$ is the Markov transition of $X_n$ and $A_n$ is the
set-realization of the stopping time $T$ (see Section 2.2.3, page 53). From
the observation above, we find the fixed point equations

$$\gamma Q_{n+1} = \gamma \quad \text{and} \quad \Phi_{n+1}(\eta) = \eta$$

with the one-step transition $\eta_{n+1} = \Phi_{n+1}(\eta_n)$ associated
with the updated Feynman-Kac flow. We can improve a little
Proposition 12.2.1 when $T$ is the entrance time of $X'$ into a set of the
form $(B \cup C)$, with $B \cap C = \emptyset$, and $f$ is the indicator test
function

$$f(T, X'_{[0,T]}) = 1_B(X'_T)$$




In this case, arguing as in the proof of Proposition 12.2.1, we find that

$$\gamma_n(f) = \gamma(f) - \gamma(f_n) \quad \text{with} \quad f_n(T, X'_{[0,T]}) = 1_{(n, \infty)}(T)\, 1_B(X'_T)$$

from which we conclude that

$$0 \le \gamma(f) - \gamma_n(f) \le P(X'_T \in B,\; T > n)$$

If $T$ is not almost surely finite, the above analysis remains valid on the
event $(T < \infty)$. More precisely, Proposition 12.2.1 holds true if we
replace $f(X_T)$ and $f(X_n)$ by $f(X_T) 1_{T < \infty}$ and
$f(X_n) 1_{T < \infty}$, and $P(n < T)$ by $P(n < T < \infty)$.

Proposition 12.2.1 can be extended to bounded potential functions $G'_n$
as soon as we have $\lambda = \sup_n \log \|G'_n\|$ and
$E(e^{\lambda T}) < \infty$. In this case, we check that

$$|\gamma_n(f) - \gamma(f)| \le 2 E(e^{\lambda T} 1_{T > n}) \quad (\to 0 \text{ as } n \to \infty)$$

The Feynman-Kac models in excursion space presented above provide a
nice probabilistic interpretation of Dirichlet problems with boundary
conditions. For instance, let $X'_n$ be a time homogeneous Markov chain with
Markov transitions $M'$ on some measurable space $(E', \mathcal{E}')$, and
let $G'$ be a $(0, 1]$-valued potential function on $E'$. If we let $T$ be
the exit time of $X'_n$ from a measurable set $A \in \mathcal{E}'$, then for
each $f \in B_b(E')$ the functions

$$D(f)(x) = E_x\Big( f(X'_T) \prod_{p=1}^{T} G'(X'_p) \Big)$$

satisfy the pair of equations

$$\begin{cases} D(f)(x) = f(x) & \text{if } x \in A^c \\ D(f)(x) = M'(G' D(f))(x) & \text{if } x \in A \end{cases}$$

If we interpret $D$ as a bounded integral operator, with some obvious
abusive notations we have

$$\begin{cases} \mu D = \mu & \text{if } \mu \in \mathcal{P}(A^c) \\ \mu D = \mu Q' D & \text{if } \mu \in \mathcal{P}(A) \end{cases}$$

with the integral operator $Q'(f) = M'(G' f)$.
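
The pair of equations for D(f) can be solved directly on a small example, which gives a reference value against which particle estimates may be tested. The sketch below assumes a simple random walk on {0, ..., width} with A = {1, ..., width - 1}, so that M' averages over the two neighbours; the function name `dirichlet_solve` and the Gauss-Seidel sweep are illustrative choices, not part of the text.

```python
def dirichlet_solve(f, g, width, sweeps=20000, tol=1e-13):
    """Solve D = f on the boundary {0, width} and D = M'(G' D) inside A."""
    d = [f(x) if x in (0, width) else 0.0 for x in range(width + 1)]
    for _ in range(sweeps):
        delta = 0.0
        for x in range(1, width):
            # M'(G' D)(x) for the simple random walk: average over both neighbours
            new = 0.5 * (g(x - 1) * d[x - 1] + g(x + 1) * d[x + 1])
            delta = max(delta, abs(new - d[x]))
            d[x] = new
        if delta < tol:
            break
    return d
```

With the constant potential G' = 1 and f the indicator of the right endpoint, D(f)(x) is the probability of leaving through the right endpoint, which for the simple random walk equals x/width.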

The particle approximation model associated with the Feynman-Kac
model consists in stopped excursion-valued particles that evolve and
interact according to the potential function $G_n$ introduced in (12.2). In
the exit time case discussed above, the excursions are stopped as soon as
they exit the set $A$. In this situation, their potential value is equal
to 1. The other particles explore the state space, and interact with the
whole configuration, in accordance with the absorption potential function
$G'$. By construction, the algorithm stops as soon as all the particles have
exited the set $A$. Also observe that for $G_n = 1$, the particle
interpretation model reduces to $N$ iid excursion-valued particles.

We end this section with an important observation. By Proposition 12.2.1,
the function $D(f)(x)$ can be approximated by the unnormalized
Feynman-Kac flow

$$D_n(f)(x) = E_x\Big( f(X'_{T \wedge n}) \prod_{p=1}^{n} G_p(p \wedge T, X'_{p \wedge T}) \Big)$$

with the potential function $G_p(p \wedge T, X'_{p \wedge T})$ equal to
$G'(X'_p)$ if $p \le T$ and to 1 if $p > T$, as in (12.2). By the
multiplicative formula presented in Proposition 2.3.1, we have

$$D_n(f)(x) = \eta^{(x)}_n(G_n f) \prod_{p=1}^{n-1} \eta^{(x)}_p(G_p)$$

where $\eta^{(x)}_n$ represents the prediction Feynman-Kac flow associated
to the stopped process and the potential functions $G_n$, and starting at
$x$ at time $n = 0$. One rather crude numerical approximation of
$D_n(f)(x)$ will be to start at each $x \in A$ a separate particle model.
One alternative and more judicious strategy is to evolve a single particle
approximation model, properly initialized in $A$, and use the descendant
genealogical tree approximation models (11.15) described in Section 11.4.

Let us make life slightly more complicated by considering only the
excursions of a time-homogeneous Markov chain $X'$ from a set $A$ into a
particular subset $B \subset A^c$. In other words, we are given a partition
$A^c = B \cup C$ and the set $C$ is regarded as a hard and absorbing
obstacle. More formally, we suppose that $X'$ starts in $A$ and exits the
set at a random time $T$. The law of the excursions ending in $B$ is given
by the Feynman-Kac measures

$$\gamma(f) = E\big( f(T, X'_{[0,T]})\, 1_B(X'_T) \big) = E\Big( f(T, X'_{[0,T]}) \prod_{p=1}^{T} 1_{A \cup B}(X'_p) \Big)$$

If we let $R$ be the first time $X'$ hits $C$, then we have

$$\gamma(1) = P(T < R) = P(X'_T \in B)$$

$$\eta(f) = \gamma(f)/\gamma(1) = E\big( f(T, X'_{[0,T]}) \mid T < R \big)$$

This formulation, combined with Proposition 12.2.1, allows us to estimate
$\gamma(f)$ by the Feynman-Kac flow associated to the stopped
excursion-valued process and defined by

$$\gamma_n(f) = E\Big( f(T \wedge n, X'_{[0, T \wedge n]}) \prod_{p=1}^{n} 1_{A \cup B}(X'_{p \wedge T}) \Big)$$

We readily observe that

$$\gamma_n(1) = P(T \wedge n < R) \quad \text{and} \quad \eta_n(f) = E\big( f(T \wedge n, X'_{[0, T \wedge n]}) \mid T \wedge n < R \big)$$

The particle interpretation model consists in stopped excursion-valued
particles. A particle that enters into $C$ is killed, and instantly a
randomly chosen excursion in $A \cup B$ duplicates. Note that the excursions
from $A$ to $B$ are stopped forever (since their potential value is equal
to 1). The particle model is stopped as soon as all the particles exit $A$,
or are absorbed by $C$.
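
The killing and duplication mechanism just described can be sketched on a toy model. The example below assumes a simple random walk on {0, ..., width} with A = {1, ..., width - 1}, B = {width} and C = {0}; only the current position of each excursion is stored, and the product of the successive survival fractions is used as the estimate of P(T < R). The function name and all parameters are hypothetical.

```python
import random

def obstacle_particle_estimate(start, width, n_particles, rng, max_steps=10_000):
    """Estimate P(T < R): the walk from `start` reaches B before the obstacle C."""
    particles = [start] * n_particles
    estimate = 1.0
    for _ in range(max_steps):
        if all(x == width for x in particles):
            break                          # every excursion is frozen in B
        # mutation: excursions still in A move, those already in B stay frozen
        particles = [x if x == width else x + rng.choice((-1, 1))
                     for x in particles]
        alive = [x for x in particles if x != 0]
        if not alive:
            return 0.0                     # the whole population hit C
        estimate *= len(alive) / n_particles   # fraction avoiding C at this step
        # selection: each killed particle is replaced by a randomly chosen survivor
        particles = [x if x != 0 else rng.choice(alive) for x in particles]
    return estimate
```

For the simple random walk the exact value is start/width, which gives a quick sanity check of the output.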

When the obstacle set $C$ is too large and too attractive, most of the
excursions hit the set $C$. In this case the particle approximation model is
not really efficient. Two strategies can be underlined. The first idea is to
change the reference measure so that the excursions become more likely
to avoid the set $C$. To guide the reader, we let $E$ and $\overline{E}$ be
the expectation operators with respect to the law of a Markov chain $X'$
with transitions $M'_n$ and $\overline{M}_n$. Suppose that
$M'_n(x, \cdot) \ll \overline{M}_n(x, \cdot)$ for all $x \in E'$. In this
case, the Feynman-Kac measures $\gamma_n$ can be rewritten as follows

$$\gamma_n(f) = \overline{E}\Big( f(T \wedge n, X'_{[0, T \wedge n]}) \prod_{p=1}^{T \wedge n} 1_{A \cup B}(X'_p)\, \frac{dM'_p(X'_{p-1}, \cdot)}{d\overline{M}_p(X'_{p-1}, \cdot)}(X'_p) \Big)$$

The particle interpretation model is defined as before except that the
particles explore the state space with the Markov transitions
$\overline{M}_n$ and they are updated using the Radon-Nikodym potential
functions. The second idea is to introduce a judicious multilevel
decomposition, and then freeze the particles as soon as they enter into a
level from which the next excursion is more likely to enter in $B$. This
more sophisticated strategy is described in detail from Section 12.2.3 to
Section 12.2.5. In some instances, the variance in the central limit
theorem can be explicitly computed or at least estimated, and compared with
more crude Monte Carlo methods. The interested reader is referred to
Section 12.2.7.



12.2.3 Multilevel Feynman-Kac Formulae 

Let $(X_t)_{t \in I}$ be a strong Markov process taking values in some
metric state space $(S, d)$ with discrete or continuous time index
$I = \mathbb{R}_+$ or $I = \mathbb{N}$. For discrete time models we recall
that the strong Markov property always holds (see for instance
Section 2.2.3). We suppose $X$ is defined on the canonical filtered
probability space
$(\Omega = D(I, S), F = (F_t)_{t \in I}, (P_x)_{x \in S})$ of
left-continuous and right-limited paths $D(I, S)$ from $I$ into $S$ (for
$I = \mathbb{N}$, note that $D(I, S) = S^{\mathbb{N}}$). For any
distribution $\eta_0 \in \mathcal{P}(S)$, we write
$P_{\eta_0} = \int_S \eta_0(dx)\, P_x$, the distribution of $X$ with initial
distribution $\eta_0$.

We consider a nonempty measurable subset $A \subset S$ and we let $T$ be
the first time $X_t$ exits from $A$. We further suppose that the
complementary set $A^c$ is decomposed into two disjoint subsets
$A^c = B \cup C$. We let $R$ be the first time $X$ hits the set $C$. By
definition, we have $T \le R$ with $T < R$ as soon as $X$ hits $B$ before
$C$. Finally, we assume that $T$ is a finite stopping time in the sense that
$P_x(T < \infty) = 1$ for any $x \in S$. In other words, if we let $T_D$ be
the entrance time into a measurable set $D \subset S$, then we have

$$T = T_{B \cup C} = T_B \wedge T_C, \quad R = T_C \quad \text{and} \quad (T < R) \Longleftrightarrow (T_B < T_C)$$

In addition, suppose that $T_C$ is a finite stopping time, let
$X^C_t = X_{t \wedge T_C}$ be the stopped process associated with $T_C$, and
let $T^C_B$ be the first time $X^C_t$ enters in $B$. Then, we clearly have
the equivalent formulations

$$(T < R) \Longleftrightarrow (T_B < T_C) \Longleftrightarrow (T^C_B < \infty)$$

On the event $(T^C_B < \infty)$, we have $T = T_B = T^C_B$ and the random
path $(X^C_t)_{t \in [0, T^C_B]} = (X_t)_{t \in [0, T]}$ represents the
excursion from the origin up to the entrance time in $B$. In the opposite
case (with the usual convention $\inf \emptyset = \infty$), we have
$T^C_B = \infty$ on the event $T_C < T_B$ and
$(X^C_t)_{t \in [0, T^C_B]} = (X_t)_{t \in [0, T_C]}$ represents the
excursion from the origin up to the entrance time in $C$.

Note that even if $T$ is finite it may take arbitrarily large values and
the process may be trapped in $A$ for an arbitrarily long period. We are
interested in solving numerically functional expectations of the form

$$\Gamma(F) = \begin{cases} E_{\eta_0}\Big( F(T, X_T)\, \exp\Big\{ -\int_0^T V_s(X_s)\, ds \Big\}\, 1_{T < R} \Big) & \text{if } I = \mathbb{R}_+ \\ E_{\eta_0}\Big( F(T, X_T)\, \Big\{ \prod_{p=1}^{T} G_p(X_p) \Big\}\, 1_{T < R} \Big) & \text{if } I = \mathbb{N} \end{cases} \qquad (12.3)$$

where $F$ is a bounded measurable function on $(I \times S)$ and
$V : (s, x) \in (I \times S) \mapsto V_s(x) \in \mathbb{R}_+$,
$G : (s, x) \in (I \times S) \mapsto G_s(x) \in [0, 1]$ is a pair of
bounded potential functions.

The empty set $C = \emptyset$ is not excluded. In this situation, we use
the convention that $R = \infty$ and the resulting models reduce to the
class of excursion models discussed in Section 12.2.2. In this section, the
set $C$ has to be thought of as a hard obstacle set the Markov particle
tries to avoid.

To fix these ideas, let us suppose that $X_0$ starts in some particular
region $A_0 \subset A$ with an initial distribution $\eta_0$. During its
excursion from $A_0$ to $A^c (= B \cup C)$, the process passes through a
decreasing sequence of level sets $(B_n)_{n = 0, \ldots, m}$ with

$$B = B_m \subset \ldots \subset B_1 \subset B_0$$

The splitting parameter $m$ and the choice of the level sets
$(B_n)_{0 \le n \le m}$ depend on the problem at hand, but it is important
to choose these quantities such that

$$A_0 = B_0 - B_1 \quad \text{and} \quad B_0 \cap C = \emptyset$$







We refer the reader to Section 12.2.5 for some worked-out examples of
splitting levels in the context of rare event analysis. To capture the
behavior of $X$ between the different levels $(B_n)_{0 \le n \le m}$, we let
$T_n$, $1 \le n \le m$, be the first time $X$ hits $B_n \cup C$; that is

$$T_n = \inf \{ 0 \le t : X_t \in B_n \cup C \}$$

We associate with these entrance times the discrete stochastic sequence of
excursions

$$\mathcal{X}_n = (T_n, (X_t \,;\, T_{n-1} \le t \le T_n)) \in E = \cup_{s \le t} (\{t\} \times D([s, t], S)) \qquad (12.4)$$

By construction, we also notice that the random sequence of level-crossing
times is increasing:

$$T_0 = 0 \le T_1 \le \ldots \le T_m = T$$

By a direct inspection, we see that if $T_n < R$, then the second component
of $\mathcal{X}_n$ represents the excursion of the process $X$ between the
successive levels $B_{n-1}$ and $B_n$ so that $T_n$ can be alternatively
defined by the inductive formulae

$$T_n = \inf \{ T_{n-1} \le t : X_t \in B_n \cup C \}$$

Under our assumptions, we also observe that these entrance times are finite
and

$$(T < R) = (T_m < R) = (T_1 < R, \ldots, T_m < R)$$

By the strong Markov property, we prove the following result.

Proposition 12.2.2 The stochastic sequence
$(\mathcal{X}_n)_{0 \le n \le m}$ defined by

$$\mathcal{X}_n = (T_n, (X_t \,;\, T_{n-1} \le t \le T_n)) \in E = \cup_{s \le t} (\{t\} \times D([s, t], S))$$

forms a Markov chain taking values in the set of excursions $E$.

In this interpretation, it is £dso important to note that the level indexes 
n € {0, . . . ,m} are regarded as the time indexes of the excursion Markov 
model. Whenever C is nonempty, it may happen that the excursion starting 
at some level, say Bn-i. visits C before entering into the next desired level 
Bn- In this case, we have 

Tn = R=^'ip> n Tp = R and Xp = XR^C 

In the opposite case, it may happen that a given excursion starting at some level, say $B_{p-1} - B_p$,

$$\mathcal{X}_p = (T_p, (X_t ; T_{p-1} \le t \le T_p)) \quad\text{with}\quad X_{T_{p-1}} \in B_{p-1} - B_p$$

enters "directly" into some level $X_{T_p} \in B_n \subset B_p$ with $n > p$ without visiting the set $B_p - B_n$. In this case, recalling that $B_n \subset \ldots \subset B_{p+1} \subset B_p$, we have $X_{T_p} = X_{T_{p+1}} = \ldots = X_{T_n} \in B_n$, $T_p = T_{p+1} = \ldots = T_n$, and $\mathcal{X}_p = \mathcal{X}_{p+1} = \ldots = \mathcal{X}_n$. In other words, in this case the process is frozen during $(n - p)$ units of time. This apparently innocent observation is in fact one of the key ingredients of the corresponding particle interpretation models. Loosely speaking, the particle excursions that successfully enter into some level will also be frozen. If some of the others never succeed, they will be killed and instantly a randomly chosen "frozen particle" duplicates.
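The freezing effect can be checked on a toy discrete-time path. The sketch below is only an illustration under assumptions that are not in the text: the levels are half-lines $B_n = [b_n, \infty)$ with increasing thresholds $b_n$, the hard obstacle set is $C = (-\infty, c]$, and the helper name `level_crossing_times` is hypothetical.

```python
def level_crossing_times(path, thresholds, c):
    """Return [T_1, ..., T_m] for a discrete-time path.

    T_n is the first index t with path[t] in B_n ∪ C, where
    B_n = [thresholds[n-1], +inf) and C = (-inf, c] (illustrative choices).
    An entry is None when the path never enters B_n ∪ C.
    """
    times = []
    for b in thresholds:
        hit = next((t for t, x in enumerate(path) if x >= b or x <= c), None)
        times.append(hit)
    return times


# A path that jumps from 0 to 3 enters levels 2 and 3 at the same time,
# so T_2 = T_3 and the excursion is "frozen" for one unit of level-time.
print(level_crossing_times([0, 1, 0, 3, 4, 2, 5], [1, 2, 3, 5], c=-1))

# A path absorbed by C before reaching level 2: T_2 is the hitting time of C.
print(level_crossing_times([0, 1, 0, -1, 2], [1, 2], c=-1))
```

In the first call the times are nondecreasing and two of them coincide, which is the frozen situation described above; in the second call the last time is the hitting time $R$ of the obstacle.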
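One way to check whether or not a random path has succeeded in reaching the desired $n$th level is to consider the potential functions $\mathcal{G}_n$ on $E$ defined for each $t \in I$ and $x = (x_r)_{s\le r\le t} \in D([s,t], S)$ with $s \le t$ by

$$\mathcal{G}_n(t,x) = 1_{B_n}(x_t)\,\exp\Big\{-\int_s^t V_r(x_r)\,dr\Big\} \quad\text{if } I = \mathbb{R}_+$$

$$\mathcal{G}_n(t,x) = 1_{B_n}(x_t) \prod_{p=s+1}^{t} G_p(x_p) \quad\text{if } I = \mathbb{N}$$

In this notation, we have for each $n$

$$(T_n < R) = (T_1 < R, \ldots, T_n < R) = (\mathcal{G}_1(\mathcal{X}_1) = 1, \ldots, \mathcal{G}_n(\mathcal{X}_n) = 1)$$

$$(\mathcal{X}_0, \ldots, \mathcal{X}_n) = \big((0, X_0), (T_1, (X_t ; 0 \le t \le T_1)), \ldots, (T_n, (X_t ; T_{n-1} \le t \le T_n))\big)$$

In the further development, we slightly abuse the notation and sometimes write $[X_t ; 0 \le t \le T_n]$ instead of $(\mathcal{X}_0, \ldots, \mathcal{X}_n)$, the sequence of excursions of $X$ between the levels $B_0, \ldots, B_n$. Using elementary calculations, we prove the following proposition.

Proposition 12.2.3 (Multilevel Feynman-Kac models) For any $n \in \mathbb{N}$ and $f_n \in \mathcal{B}_b(E^{n+1})$, we have

$$\mathbb{E}_{\nu_0}\Big(f_n(\mathcal{X}_0, \ldots, \mathcal{X}_n) \prod_{p=1}^{n} \mathcal{G}_p(\mathcal{X}_p)\Big)$$

$$= \mathbb{E}_{\nu_0}\Big(f_n([X_t ; 0 \le t \le T_n])\, 1_{T_n<R}\, \exp\Big\{-\int_0^{T_n} V_s(X_s)\,ds\Big\}\Big) \quad\text{if } I = \mathbb{R}_+$$

$$= \mathbb{E}_{\nu_0}\Big(f_n([X_t ; 0 \le t \le T_n])\, 1_{T_n<R}\, \Big\{\prod_{p=1}^{T_n} G_p(X_p)\Big\}\Big) \quad\text{if } I = \mathbb{N}$$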



The particle interpretations of these discrete generation Feynman-Kac models should now be clear to the reader. For instance, the simplest particle interpretation in the context of continuous time models goes as follows. We start with $N$ independent copies $(x_0^i)_{1\le i\le N}$ of $X_0$, and during the mutation stage we randomly evolve these particles up to the first time they hit the first-level set $(B_1 \cup C)$. If $(x_1^i(t))_{0\le t\le T_1^i}$ denotes the excursion of the $i$th particle from $A_0$ to $(B_1 \cup C)$, we compute the weights

$$\mathcal{G}_1\big(T_1^i, (x_1^i(t))_{0\le t\le T_1^i}\big) = 1_{B_1}\big(x_1^i(T_1^i)\big)\, \exp\Big\{-\int_0^{T_1^i} V_s(x_1^i(s))\,ds\Big\}$$

Note that if an excursion hits $C$, its weight is 0; otherwise the exponential weight represents the strength of the soft obstacles it has visited during its evolution to $B_1$. In this sense, we can interpret these weights as the predicted lifetime of each particle. During the selection transition, with a probability $\exp\{-\int_0^{T_1^i} V_s(x_1^i(s))\,ds\}$, the excursion survives. Otherwise, the particle dies and instantly one of the excursions having succeeded in reaching $B_1$ is randomly chosen with a probability proportional to its predicted lifetime and splits into two identical copies. At the second step, we again evolve the selected excursions up to the first time they reach the set $(B_2 \cup C)$, and we update the particle configuration in accordance with their predicted lifetime between levels $B_1$ and $B_2$, and so on.

We refer the reader to Sections 11.3 and 11.4 for a more thorough discus- 
sion on these particle algorithms. We can clearly combine these multisplit- 
ting strategies with the stopped excursion valued particle approximation 
models described in the end of section 12.2.2. Precise excursion particle 
models evolving in an absorbing medium with only hard obstacles will also 
be described in Section 12.2.5. 

12.2.4 Dirichlet Problems with Hard Boundary Conditions 

In this section, we illustrate the abstract Feynman-Kac models presented in Section 12.2.3 in the context of Dirichlet problems with boundary conditions. We use the same notation and convention as there. In the homogeneous and discrete time case, by the Markov property we easily check that the function

$$h(x) = \mathbb{E}_x\Big(f(X_T) \prod_{p=1}^{T} G(X_p)\, 1_{T<R}\Big) = \mathbb{E}_x\Big(f(X_T)\, 1_B(X_T) \prod_{p=1}^{T} G(X_p)\Big)$$

is a solution of the Dirichlet problem

$$\begin{cases} M(Gh)(x) = h(x) & \text{for } x \in A \\ h(x) = f(x)\,1_B(x) & \text{for } x \notin A \end{cases}$$

where $M(x, dy)$ represents the Markov transitions of the chain $X_n$, $A \subset S$ is a given subset, and $A^c = (B \cup C)$ is a partition of the complementary set $A^c$. If we choose a constant potential function, say $G = 1$, then the function

$$h(x) = \mathbb{E}_x(f(X_T)\,1_{T<R}) = \mathbb{E}_x(f(X_T)\,1_{X_T \in B})$$

is a nonnegative harmonic function in $A$ with the boundary values $h = f 1_B$ on $A^c$.
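As a small numerical illustration of this discrete Dirichlet problem, the sketch below takes $G = 1$, $f = 1$ and the simple random walk on the integers with $A = (a, b)$, $B = \{b\}$ and $C = \{a\}$, and solves $h = M(h)$ on $A$ by repeated sweeps. The function names, the Gauss-Seidel style iteration and the number of sweeps are illustrative assumptions; the closed form used for comparison is the classical gambler's ruin formula that reappears in Exercise 12.2.9.

```python
def solve_dirichlet(p, a, b, sweeps=2000):
    """Iterate h(x) = p*h(x+1) + q*h(x-1) on a < x < b with h(a) = 0, h(b) = 1."""
    q = 1.0 - p
    h = [0.0] * (b - a + 1)    # h[i] stores h(a + i); boundary value at a is 0
    h[-1] = 1.0                # boundary value f * 1_B at x = b
    for _ in range(sweeps):
        for i in range(1, b - a):
            h[i] = p * h[i + 1] + q * h[i - 1]
    return h


def closed_form(p, a, b, x):
    """Gambler's ruin probability P_x(reach b before a)."""
    q = 1.0 - p
    if p == q:
        return (x - a) / (b - a)
    r = q / p
    return (r ** x - r ** a) / (r ** b - r ** a)


h = solve_dirichlet(0.4, 0, 10)
gap = max(abs(h[x] - closed_form(0.4, 0, 10, x)) for x in range(11))
print("largest gap between iteration and closed form:", gap)
```

The iteration is only meant to make the harmonic equation concrete; for larger state spaces a direct linear solver would be the more natural choice.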

To illustrate these models in continuous time settings, we let $L$ be the second-order linear differential operator

$$L = \sum_{1\le i\le d} a_i\, \frac{\partial}{\partial x_i} + \frac{1}{2} \sum_{1\le i,j\le d} (b b^*)_{i,j}\, \frac{\partial^2}{\partial x_i\, \partial x_j}$$

where $a_i$ are bounded Lipschitz functions on $S = \mathbb{R}^d$ and $bb^*$ is symmetric and uniformly strictly positive definite (i.e., of full rank). The Dirichlet problem is now to find a function $h$ that satisfies the equations

$$\begin{cases} L(h)(x) = V(x)\,h(x) & \text{for } x \in A \\ h(x) = g(x) & \text{for } x \in \partial A \end{cases}$$

where $A$ is a bounded and open set with smooth boundary $\partial A$ and $(V, g)$ is a given pair of bounded functions. The probabilistic interpretation of this problem is as follows. First we observe that $L$ is the infinitesimal generator of the $d$-dimensional stochastic process $X_t$, $t \ge 0$, defined by the stochastic differential equation

$$dX_t^i = a_i(X_t)\,dt + \sum_{j=1}^{d} b_{i,j}(X_t)\, dW_t^j$$

where $W^i$, $1 \le i \le d$, are independent Wiener processes. Note that our regularity assumption on $b_{i,j}(x)$ ensures that $\mathbb{P}_x(T < \infty) = 1$ (in the opposite case, the "diffusion" may never succeed in reaching the boundary). In this situation, for any partition $A^c = (B \cup C)$, the probabilistic representation of the solution of the Dirichlet problem with $g = f 1_B$ is given by the Feynman-Kac formula

$$h(x) = \mathbb{E}_x\Big(f(X_T)\, \exp\Big\{-\int_0^T V(X_s)\,ds\Big\}\, 1_{T<R}\Big)$$

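In some very special cases, explicit solutions exist. For instance, if we take $(b, f) = (Id, 1)$, $(a, V) = (0, 0)$, the open annulus $A = A_{\epsilon_1,\epsilon_2} = \{x \in \mathbb{R}^d : \epsilon_1 < |x| < \epsilon_2\}$ with $0 < \epsilon_1 < \epsilon_2$, $C = \{|x| \ge \epsilon_2\}$, and $B = \{|x| \le \epsilon_1\}$, then we have $X_t = W_t$ and for any $x \in A$

$$h(x) = \mathbb{E}_x(1_B(W_T)) = \mathbb{P}_x(|W_T| \le \epsilon_1) = \begin{cases} \dfrac{\log \epsilon_2 - \log |x|}{\log \epsilon_2 - \log \epsilon_1} & \text{if } d = 2 \\[2ex] \dfrac{|x|^{2-d} - \epsilon_2^{2-d}}{\epsilon_1^{2-d} - \epsilon_2^{2-d}} & \text{if } d \ge 3 \end{cases} \qquad (12.5)$$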
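Formula (12.5) is easy to evaluate numerically. The following sketch is a direct transcription under the assumption that the two cases are the classical logarithmic ($d = 2$) and power-law ($d \ge 3$) expressions; the function name `hit_inner_before_outer` is hypothetical and the argument `r` stands for $|x|$.

```python
import math


def hit_inner_before_outer(r, eps1, eps2, d):
    """P_x(|W_T| <= eps1) for Brownian motion started at |x| = r, eps1 <= r <= eps2."""
    if not (0.0 < eps1 <= r <= eps2):
        raise ValueError("need 0 < eps1 <= r <= eps2")
    if d == 2:
        return (math.log(eps2) - math.log(r)) / (math.log(eps2) - math.log(eps1))
    k = 2 - d  # negative exponent for d >= 3
    return (r ** k - eps2 ** k) / (eps1 ** k - eps2 ** k)


# Boundary values: 1 on the inner sphere, 0 on the outer sphere.
print(hit_inner_before_outer(1.0, 1.0, 2.0, 2), hit_inner_before_outer(2.0, 1.0, 2.0, 3))
# In dimension 3, pushing the outer radius to infinity gives about eps1 / |x|.
print(hit_inner_before_outer(2.0, 1.0, 1e9, 3))
```

Letting `eps1` shrink in dimension 2 drives the value toward 0, while letting `eps2` grow in dimension 3 leaves a limit strictly below 1; both limits are the subject of Exercise 12.2.5.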








FIGURE 12.1. Brownian evolution on the annulus 



In the case where $V(x) = \lambda > 0$, and the constant function $f(x) = 1$, the function $h(x) = \mathbb{E}_x(e^{-\lambda T}\, 1_{T<R})$ coincides with the Laplace transform of $T$ on the event $\{T < R\} = \{X_T \in B\}$. In general, these functional Feynman-Kac models cannot be solved analytically and their numerical solution requires extensive calculations. Suppose for instance that the obstacle set $C$ is too "large" in the sense that most of the realizations of $X_t$ tend to end in $C$. In this case, we would need to sample a large number of independent copies of $X_t$ to find at least one that reaches the desired target boundary $B$. This shows that a naive Monte Carlo method will need too many particles to get some reasonable statistical accuracy. As mentioned earlier, one advantage of the multisplitting technique comes from the fact that "good" particles are frozen as soon as they succeed in reaching some rare level and these leading individuals duplicate into several offspring. As a result, these decomposition levels behave as gateways from which the particles have more chance to hit the desired regions on the boundary. One drawback of these particle models is that their accuracy is intimately related to a judicious level decomposition of the state space.

Example 12.2.1 (Absorption event) Suppose the state $S = B \cup D$ is decomposed in two separate regions $B$ and $D$. The process $X$ evolves in the region $D$, which contains a collection of "soft" and "hard" obstacles represented respectively by the potential functions $G_p$ or $V_s$ and by a subset $C \subset D$. The particle is instantly killed as soon as it enters the "hard" obstacle set $C$. In this context, the quantities

$$\mathbb{E}_{\nu_0}\Big(1_{T<R} \prod_{p=1}^{T} G_p(X_p)\Big) \quad\text{and}\quad \mathbb{E}_{\nu_0}\Big(1_{T<R}\, \exp\Big\{-\int_0^T V_s(X_s)\,ds\Big\}\Big)$$

represent the probability of exiting the pocket of obstacles $D$ without being killed. More generally, the measures defined in Proposition 12.2.3 represent the distribution of the path process on this event. The sequence $(B_n)_{0\le n\le m}$ represents the exit levels the process needs to reach to get out of $D$ (before being killed). We notice that the pocket of obstacles $D$ may be decomposed into a collection of subpockets $D_0 \subset D_1 \subset \ldots \subset D_m = D$. Each $D_p$ may intersect the set of obstacles $C$ (see Figure 12.2). Let $B_0 = S - C$ and let $B_{p+1}$, $0 \le p \le m$, be the decreasing sequence of subsets defined by $B_{p+1} = S - (C \cup D_p)$. By direct inspection, we see that this decomposition satisfies the desired properties with the target set $B_{m+1} = S - (C \cup D_m) = S - D = B$ and the initial set $A_0 = B_0 - B_1 = D_0 - D_0 \cap C$. Figure 12.2 provides a simple example of decomposition of a medium with hard and soft obstacles.

FIGURE 12.3. Genealogical model, [exit of C(2) before killing] (N=4)

In Figure 12.3, we have illustrated the genealogical particle model associated with a killed Markov particle $X$ evolving in a pocket of obstacles $D \subset S$. In this example, the pocket $D$ is decomposed into three regions, $D_0 \subset D_1 \subset D_2 = D$. The decreasing sequence of exit levels is given by $B_{p+1} = S - (C \cup D_p)$, $p = 0, 1, 2$, with $B_0 = S - C$, $B = B_3 = S - D$, $A_0 = D_0 - D_0 \cap C$ (and the desired target set here is $B = B(3)$).







12.2.5 Rare Event Analysis 

We use the same notation and conventions as were introduced in Section 12.2.3. For the convenience of the reader, we recall that $(X_t)_{t\in I}$ is a strong Markov process taking values in some metric state space $(S, d)$ with discrete or continuous time index $I = \mathbb{R}_+ = [0, \infty)$ or $\mathbb{N}$. The process $X$ starts in some Borel set $A_0 \subset S$ with a given distribution $\nu_0 \in \mathcal{P}(S)$. We also consider a pair of Borel subsets $(B, C)$ such that $A_0 \cap C = \emptyset = B \cap C$. We associate with this pair the first time $T$ the process hits $(B \cup C)$ and the hitting time $R$ of the set $C$. Note that

$$T = \inf\{0 \le t : X_t \in B \cup C\} \le R = \inf\{t \ge 0 : X_t \in C\}$$

We also assume that $(A_0, B, C)$ is chosen so that for any initial state $x \in A_0$ we have $\mathbb{P}_x(T < \infty) = 1$. As we shall see later, the Feynman-Kac splitting models developed in this section are still valid if we replace this condition by the weaker assumption that $\mathbb{P}_x(T < \infty) > 0$ for any $x \in A_0$. Nevertheless, since the branching models are built on excursion particles between successive levels, the first condition ensures that any of these excursions is finite. One would like to estimate the quantities

$$\mathbb{P}(T < R) = \mathbb{P}(X_T \in B)$$

$$\mathrm{Law}(X_t ; 0 \le t \le T \mid T < R) = \mathrm{Law}(X_t ; 0 \le t \le T \mid X_T \in B) \qquad (12.6)$$

It often happens that most of the realizations of $X$ never reach the target set $B$ but are "attracted" and "absorbed" by some nonempty set $C$. These rare events are difficult to analyze numerically. One strategy to estimate these events is to consider the sequence of level-crossing excursions $\mathcal{X}_n$ associated with a splitting of the state space and defined in (12.4) on page 438, namely

$$\mathcal{X}_n = (T_n, (X_t ; T_{n-1} \le t \le T_n)) \in E = \cup_{s\le t}\big(\{t\} \times D([s,t], S)\big)$$

with the entrance times $T_n = \inf\{0 \le t : X_t \in B_n \cup C\}$. Following the arguments used in Section 12.2.3 to check whether or not an excursion succeeds in entering into the desired $n$th level, we consider the potential functions $\mathcal{G}_n$ on $E$ defined for each $t \in I$ and $x = (x_r)_{s\le r\le t} \in D([s,t], S)$ with $s \le t$ by $\mathcal{G}_n(t, x) = 1_{B_n}(x_t)$. Rephrasing Proposition 12.2.3 on page 439, we obtain the following Feynman-Kac representation of the desired quantities (12.6).

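Proposition 12.2.4 For any $n$ and for any $f_n \in \mathcal{B}_b(E^{n+1})$, we have that

$$\mathbb{E}\Big(f_n(\mathcal{X}_0, \ldots, \mathcal{X}_n) \prod_{p=1}^{n} \mathcal{G}_p(\mathcal{X}_p)\Big) = \mathbb{E}\big(f_n([X_t ; 0 \le t \le T_n])\, 1_{T_n<R}\big)$$

Notice that if $f_n$ is defined for any $s_n \le t_n$ and $x_n \in D([s_n, t_n])$ by

$$f_n((t_0, x_0), \ldots, (t_n, x_n)) = \varphi_n(t_0, \ldots, t_n)$$

for some $\varphi_n \in \mathcal{B}_b(I^{n+1})$, then we find that

$$\mathbb{E}(f_n([X_t ; 0 \le t \le T_n]) \mid T_n < R) = \mathbb{E}(\varphi_n(T_0, \ldots, T_n) \mid T_n < R)$$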

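We recall that the prediction Feynman-Kac model $\eta_n \in \mathcal{P}(E)$ is defined by

$$\eta_n(f) = \gamma_n(f)/\gamma_n(1) \quad\text{with}\quad \gamma_n(f) = \mathbb{E}\Big(f(\mathcal{X}_n) \prod_{p=0}^{n-1} \mathcal{G}_p(\mathcal{X}_p)\Big)$$

and corresponds to the conditional distributions

$$\eta_n(f) = \mathbb{E}(f(T_n, (X_t ; T_{n-1} \le t \le T_n)) \mid T_{n-1} < R)$$

Furthermore, it satisfies the measure-valued dynamical system

$$\eta_{n+1} = \Phi_{n+1}(\eta_n) \quad\text{with}\quad \eta_0 = \delta_0 \otimes \nu_0 \qquad (12.7)$$

The mappings $\Phi_{n+1}$ from $\mathcal{P}_n(E) = \{\eta : \eta(\mathcal{G}_n) > 0\}$ into $\mathcal{P}(E)$ are defined by

$$\Phi_{n+1}(\eta) = \Psi_n(\eta) M_{n+1} = \int_E \Psi_n(\eta)(du)\, M_{n+1}(u, \cdot)$$

The Markov kernels $M_n(u, dv)$ represent the Markov transitions of the chain of excursions $\mathcal{X}_n$. The updating mappings $\Psi_n$ are defined from $\mathcal{P}_n(E)$ into $\mathcal{P}_n(E)$ and for any $\eta \in \mathcal{P}_n(E)$ and $f \in \mathcal{B}_b(E)$ by the formula $\Psi_n(\eta)(f) = \eta(f \mathcal{G}_n)/\eta(\mathcal{G}_n)$.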

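Lemma 12.2.1 For any $n \ge 0$, we have

$$\mathbb{P}(T_n < R) = \gamma_n(\mathcal{G}_n) = \gamma_{n+1}(1) = \prod_{p=0}^{n} \eta_p(\mathcal{G}_p)$$

In addition, we have $\mathbb{P}(T_n < R \mid T_{n-1} < R) = \eta_n(\mathcal{G}_n)$ and for any $f \in \mathcal{B}_b(E)$

$$\eta_n(f) = \mathbb{E}(f(T_n, (X_t ; T_{n-1} \le t \le T_n)) \mid T_{n-1} < R)$$

$$\widehat{\eta}_n(f) = \Psi_n(\eta_n)(f) = \mathbb{E}(f(T_n, (X_t ; T_{n-1} \le t \le T_n)) \mid T_n < R)$$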
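The product formula of the lemma can be verified exactly on a toy model. The sketch below assumes the simple random walk of Example 12.2.3 started at 0, with levels $B_r = [r, \infty)$ and hard obstacle $C = (-\infty, -1]$; these choices and the function names are illustrative assumptions, and the level-to-level probabilities come from the classical gambler's ruin formula. Exact rational arithmetic is used so that the telescoping product can be compared without rounding.

```python
from fractions import Fraction


def hit_before(p, x, a, b):
    """P_x(simple random walk reaches b before a), for a <= x <= b and p != 1/2."""
    r = (1 - p) / p
    return (r ** x - r ** a) / (r ** b - r ** a)


def survival_product(p, n):
    """Return (product of eta_r(G_r) for r = 1..n, direct value of P(T_n < R))."""
    product = Fraction(1)
    for r in range(1, n + 1):
        # eta_r(G_r) = P(T_r < R | T_{r-1} < R): go from level r-1 to r before -1.
        product *= hit_before(p, r - 1, -1, r)
    direct = hit_before(p, 0, -1, n)
    return product, direct


product, direct = survival_product(Fraction(2, 5), 8)
print(product, direct, product == direct)
```

The two values agree because the nearest-neighbor walk enters each level exactly at its threshold, so the conditional probabilities multiply as stated in the lemma.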

This lemma gives a Feynman-Kac interpretation of the probability of rare events. Since the potentials are indicator functions, it is more judicious to rewrite the Boltzmann-Gibbs transformations $\Psi_n$ in terms of the selection Markov transitions $S_{n,\eta}(u, dv)$ on $E$ defined by

$$S_{n,\eta}(u, dv) = \big(1 - 1_{\mathcal{G}_n^{-1}(1)}(u)\big)\, \Psi_n(\eta)(dv) + 1_{\mathcal{G}_n^{-1}(1)}(u)\, \delta_u(dv)$$

Note that $\mathcal{G}_n^{-1}(1)$ represents the collection of excursions in $S$ entering the $n$th level $B_n$; that is

$$\mathcal{G}_n^{-1}(\{1\}) = \{u = (t, (u_r)_{s\le r\le t}) \in (\{t\} \times D([s,t])),\ s \le t \ \text{ and } \ u_t \in B_n\}$$

In this notation, the equation (12.7) can be rewritten as follows

$$\eta_{n+1} = \eta_n \mathcal{K}_{n+1,\eta_n} \quad\text{with}\quad \mathcal{K}_{n+1,\eta_n} = S_{n,\eta_n} M_{n+1} \qquad (12.8)$$

To motivate this section, we describe hereafter some more or less academic 
but especially instructive situations. 

Example 12.2.2 (Ballistic event) Suppose the state $S = A \cup C$ is decomposed into two separate regions $A$ and $C$. The process $X$ starts in $A$, and we want to estimate the probability that it enters a target $B \subset A$ before exiting $A$. In this context, the conditional distribution (12.6) represents the law of the process in this "ballistic" regime.

Example 12.2.3 (An elementary gambler's ruin process) We consider a simple random walk $X_n = x + \sum_{i=1}^{n} \epsilon_i$ on $E = \mathbb{Z}$, starting at some $x \in \mathbb{Z}$, where $(\epsilon_i)_{i\ge 1}$ is a sequence of independent and identically distributed random variables with common law

$$\mathbb{P}(\epsilon_1 = +1) = p \quad\text{and}\quad \mathbb{P}(\epsilon_1 = -1) = q$$

with $p, q \in (0, 1)$ and $p + q = 1$. If we use the convention $\sum_{\emptyset} = 0$, then we can interpret $X_n$ as the amount of money won or lost by a player starting with $x \in \mathbb{Z}$ euros in a gambling game where he/she wins and loses 1 euro with respective probabilities $p$ and $q$. If we let $a < x < b$ be two fixed parameters, one interesting question is to compute the probability that the player will succeed in winning $b - x$ euros, never losing more than $x - a$ euros. More formally, this question becomes that of computing the probability that the chain $X_n$ (starting at some $x \in (a, b)$) reaches the set $B = [b, \infty)$ before entering into the set $C = (-\infty, a]$. When $p < q$ (i.e., $p < 1/2$), the random walk $X_n$ tends to move to the left, and it becomes less and less likely that $X_n$ will succeed in reaching the desired level $B$. Following Exercise 12.2.9, we check that

$$\mathbb{P}_x(R < \infty) = 1 \quad\text{and}\quad \mathbb{P}_x(T < R) = \frac{(q/p)^x - (q/p)^a}{(q/p)^b - (q/p)^a} \qquad (12.9)$$

Example 12.2.4 (Birth and death model) We consider a simple random walk $X_n$ on $E = \mathbb{N}$ where the state 0 is an absorbing barrier and the elementary transition probabilities are defined for any $x > 0$ by

$$\mathbb{P}(X_n = x + 1 \mid X_{n-1} = x) = p(x) \qquad \mathbb{P}(X_n = x - 1 \mid X_{n-1} = x) = q(x)$$

where for $x > 0$ we have $p(x), q(x) \in (0, 1)$ and $p(x) + q(x) = 1$ and the absorbing condition $\mathbb{P}(X_n = 0 \mid X_{n-1} = 0) = 1$. We let $\mathbb{P}_x$ be the distribution of the Markov chain $X_n$ starting at $X_0 = x$ at time $n = 0$. We can interpret $X_n$ as a dynamical population model. Given a population size $X_n = x$, we have a birth $X_{n+1} = x + 1$ with a probability $p(x)$; otherwise an individual dies, $X_{n+1} = x - 1$, with a probability $q(x)$. In this context, one typical question is to evaluate the probability that the population size reaches some upper level $b < \infty$ before extinction. More formally, this consists in evaluating the probability that $X_n$ (starting at some $X_0 = x > 0$) hits the level $B = (b, \infty)$ before hitting the absorbing barrier $C = \{0\}$. If $\sum_{y\ge 0} \{\prod_{z=1}^{y} q(z)/p(z)\} = \infty$, then in Exercise 12.2.11 we will see that, for any $x \in \mathbb{N}$, $\mathbb{P}_x(R < \infty) = 1$.
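For a finite upper level the hitting probability of a birth and death chain has a classical closed form in terms of the same products $\prod_{z=1}^{y} q(z)/p(z)$. The sketch below is an illustration that is not taken from the text: it assumes the standard scale-function formula $\mathbb{P}_x(\text{reach } b \text{ before } 0) = \sum_{y=0}^{x-1} \rho(y) / \sum_{y=0}^{b-1} \rho(y)$ with $\rho(y) = \prod_{z=1}^{y} q(z)/p(z)$, and the function name is hypothetical.

```python
from fractions import Fraction


def reach_before_extinction(p, x, b):
    """P_x(chain reaches level b before 0), for 0 <= x <= b.

    `p` maps a state z >= 1 to the birth probability p(z) as a Fraction;
    the death probability is q(z) = 1 - p(z).
    """
    def rho(y):
        # rho(y) = prod_{z=1}^{y} q(z)/p(z), with the empty product equal to 1
        out = Fraction(1)
        for z in range(1, y + 1):
            out *= (1 - p(z)) / p(z)
        return out

    return sum(rho(y) for y in range(x)) / sum(rho(y) for y in range(b))


# Homogeneous case p(z) = 2/5: agrees with the gambler's ruin formula with a = 0.
print(reach_before_extinction(lambda z: Fraction(2, 5), 3, 6))

# State-dependent case: births become less likely as the population grows.
crowded = lambda z: Fraction(1, z + 1)
print(reach_before_extinction(crowded, 2, 5))
```

The returned function of $x$ vanishes at 0, equals 1 at $b$, and satisfies $h(x) = p(x)h(x+1) + q(x)h(x-1)$ in between, which can be checked directly on the exact rational values.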

12.2.6 Asymptotic Particle Analysis of Rare Events 

The $N$-particle model associated with a given collection of transitions $\mathcal{K}_{n,\eta}$ is described in Section 3.2. The precise description of the particle motion is a little involved, mainly because the state space is the set of excursions and the potentials are indicator functions. As a result, when all the particles miss the potential support, the algorithm is stopped and the system goes into some cemetery state. Because of its importance in practice, we provide next a detailed presentation. In the context of rare events, the particle model consists in evolving a collection of $N$ excursion-valued particles

$$\xi_n^i = (T_n^i, (\xi_n^i(t) ; T_{n-1}^i \le t \le T_n^i)) \in E \cup \{\Delta\}$$

$$\widehat{\xi}_n^i = (\widehat{T}_n^i, (\widehat{\xi}_n^i(t) ; \widehat{T}_{n-1}^i \le t \le \widehat{T}_n^i)) \in E \cup \{\Delta\}$$

The auxiliary point $\Delta$ stands for a cemetery or coffin point. The random time pairs $(T_{n-1}^i, T_n^i)$ and $(\widehat{T}_{n-1}^i, \widehat{T}_n^i)$ represent the length of the corresponding excursions. At the time $n = 0$, the initial system consists of $N$ independent and identically distributed random variables $\xi_0^i = (0, \xi_0^i(0))$ with common law $\eta_0 = \delta_0 \otimes \nu_0$. Since we have $\mathcal{G}_0(0, u) = 1$, there is no updating transition at time $n = 0$ and we set $\widehat{\xi}_0^i = \xi_0^i = (0, \xi_0^i(0))$ for each $1 \le i \le N$. As an aside, if we use the convention $T_{-1}^i = \widehat{T}_{-1}^i = 0$, and if we set $T_0^i = \widehat{T}_0^i = 0$, then these initial variables $(\xi_0^i, \widehat{\xi}_0^i)$ can be rewritten in the excursion form

$$\xi_0^i = (0, \xi_0^i(0)) = (T_0^i, (\xi_0^i(t) : T_{-1}^i \le t \le T_0^i))$$

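Mutation: The mutation stage $\widehat{\xi}_n \to \xi_{n+1}$ at time $(n + 1)$ is defined as follows. If $\widehat{\xi}_n = \Delta$, we set $\xi_{n+1} = \Delta$. Otherwise, during mutation, each selected excursion

$$\widehat{\xi}_n^i = (\widehat{T}_n^i, (\widehat{\xi}_n^i(t) ; \widehat{T}_{n-1}^i \le t \le \widehat{T}_n^i))$$

evolves randomly and independently of each other according to the Markov transition $M_{n+1}$ of the chain $\mathcal{X}_n$. In other words

$$\xi_{n+1}^i = (T_{n+1}^i, (\xi_{n+1}^i(t) ; T_n^i \le t \le T_{n+1}^i))$$

is a random variable with distribution $M_{n+1}(\widehat{\xi}_n^i, \cdot)$. More precisely, we set $T_n^i = \widehat{T}_n^i$, and the particle $\widehat{\xi}_n^i$ at time $t = \widehat{T}_n^i$ evolves randomly as a copy $(\xi_{n+1}^i(s))_{s\ge \widehat{T}_n^i}$ of the excursion process $(X_s)_{s\ge \widehat{T}_n^i}$ starting at $X_{\widehat{T}_n^i} = \widehat{\xi}_n^i(\widehat{T}_n^i)$ at time $s = \widehat{T}_n^i$ and up to the first time it visits $B_{n+1}$ or returns to $C$. The stopping time $T_{n+1}^i$ represents the first time $t \ge \widehat{T}_n^i$ the $i$th excursion hits the set $B_{n+1} \cup C$.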

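Selection: The selection mechanisms $\xi_{n+1} \to \widehat{\xi}_{n+1}$ are defined as follows. In the mutation stage, we have sampled $N$ excursions

$$\xi_{n+1}^i = (T_{n+1}^i, (\xi_{n+1}^i(t) ; T_n^i \le t \le T_{n+1}^i))$$

Some of these particles have succeeded in reaching the desired set $B_{n+1}$ and the other ones have entered into $C$. We denote by

$$I^N(n + 1) = \{i : \xi_{n+1}^i(T_{n+1}^i) \in B_{n+1}\}$$

the labels of the particles having reached the $(n + 1)$st level, and we set $m(\xi_{n+1}) = \frac{1}{N} \sum_{i=1}^{N} \delta_{\xi_{n+1}^i}$. Two situations may occur. If $I^N(n + 1) = \emptyset$, then none of the particles have succeeded in hitting the desired level. In this case, we have $m(\xi_{n+1}) \notin \mathcal{P}_{n+1}(E)$. Therefore the algorithm has to be stopped and we set $\widehat{\xi}_{n+1} = \Delta$. Otherwise, the selection transition is defined as follows. Each particle

$$\widehat{\xi}_{n+1}^i = (\widehat{T}_{n+1}^i, (\widehat{\xi}_{n+1}^i(t) ; \widehat{T}_n^i \le t \le \widehat{T}_{n+1}^i))$$

is sampled according to the selection distribution

$$S_{n+1, m(\xi_{n+1})}(\xi_{n+1}^i, \cdot)$$

More precisely, if the $i$th excursion has reached the desired level (i.e., $\xi_{n+1}^i(T_{n+1}^i) \in B_{n+1}$), then we set $\widehat{\xi}_{n+1}^i = \xi_{n+1}^i$. In the opposite case we have $\xi_{n+1}^i(T_{n+1}^i) \notin B_{n+1}$ when the particle has not reached the $(n + 1)$st level but it has visited the set $C$. In this case, $\widehat{\xi}_{n+1}^i$ is chosen randomly and uniformly in the set

$$\{\xi_{n+1}^j ; \xi_{n+1}^j(T_{n+1}^j) \in B_{n+1}\} = \{\xi_{n+1}^j ; j \in I^N(n + 1)\}$$

of excursions having entered into $B_{n+1}$. In other words, each particle that does not enter into the $(n + 1)$st level is killed, and instantly a different particle in the $B_{n+1}$ level splits into two offspring. For each time $n < \tau^N = \inf\{n \ge 0 ; \forall 1 \le i \le N,\ \xi_n^i(T_n^i) \notin B_n\}$, the $N$-particle approximation measures $(\gamma_n^N, \eta_n^N, \widehat{\eta}_n^N)$ associated with $(\gamma_n, \eta_n, \widehat{\eta}_n)$ are defined by

$$\gamma_n^N = \Big[\prod_{p=0}^{n-1} \eta_p^N(\mathcal{G}_p)\Big] \times \eta_n^N \quad\text{with}\quad \eta_n^N = \frac{1}{N} \sum_{i=1}^{N} \delta_{(T_n^i, (\xi_n^i(t) ; T_{n-1}^i \le t \le T_n^i))}$$

$$\widehat{\eta}_n^N = \Psi_n(\eta_n^N) = \frac{1}{\mathrm{Card}(I^N(n))} \sum_{i \in I^N(n)} \delta_{(T_n^i, (\xi_n^i(t) ; T_{n-1}^i \le t \le T_n^i))}$$

We also notice that

$$\gamma_{n+1}^N(1) = \gamma_n^N(\mathcal{G}_n) = \prod_{p=0}^{n} \eta_p^N(\mathcal{G}_p) = \prod_{p=1}^{n} \frac{\mathrm{Card}(I^N(p))}{N}$$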

In other words, $\gamma_{n+1}^N(1)$ is the product of the proportions of excursions having entered levels $B_1, \ldots, B_n$. Also notice that $\widehat{\eta}_n^N$ is the occupation measure of the excursions entering the $n$th level. The corresponding genealogical tree model is defined in the same way by tracking back in time the whole ancestor line of current individuals. Here the path-particles at time $n$ take values in $E_n = E^{n+1}$ and can be written as

$$\xi_n^i = (\xi_{0,n}^i, \ldots, \xi_{n,n}^i) \in E_n = E^{n+1}$$

with, for each $0 \le p \le n$,

$$\xi_{p,n}^i = (T_{p,n}^i, (\xi_{p,n}^i(t) ; T_{p-1,n}^i \le t \le T_{p,n}^i))$$

The updated particle density profiles associated with this genealogical tree-based algorithm are defined by

$$\widehat{\eta}_n^N = \frac{1}{\mathrm{Card}(I^N(n))} \sum_{i \in I^N(n)} \delta_{(\xi_{0,n}^i, \ldots, \xi_{n,n}^i)}$$


The asymptotic analysis of these particle measures is discussed in Chapters 7 to 10. For instance, rephrasing Theorem 7.4.1 and Proposition 7.4.1 we check that

Theorem 12.2.1 For any $n \ge 0$ and $N \ge 1$ we have

$$\mathbb{P}(\tau^N \le n) \le a(n) \exp(-N/b(n))$$

The particle estimates are unbiased, $\mathbb{E}(\gamma_{n+1}^N(1)\, 1_{n<\tau^N}) = \mathbb{P}(T_n < R)$, and for any $p \ge 1$ and $n \ge 0$ we have

$$\sqrt{N}\ \mathbb{E}\big(|\gamma_{n+1}^N(1)\, 1_{n<\tau^N} - \mathbb{P}(T_n < R)|^p\big)^{1/p} \le a(p)\, b(n)$$

In addition, for any $f_n \in \mathcal{B}_b(E_n)$ with $\|f_n\| \le 1$ we have that

$$\sqrt{N}\ \mathbb{E}\big(|\widehat{\eta}_n^N(f_n)\, 1_{n<\tau^N} - \mathbb{E}(f_n([X_t , 0 \le t \le T_n]) \mid T_n < R)|^p\big)^{1/p} \le a(p)\, b(n)$$

for some finite constants $a(p), b(n) < \infty$ whose values only depend respectively on the parameters $p$ and $n$.
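The mutation and selection transitions described in this section can be sketched for the gambler's ruin walk. The code below is a minimal illustration, not the general excursion-valued algorithm: it assumes a simple random walk started at 0, levels $B_r = [r, \infty)$, hard obstacle $C = (-\infty, -1]$, and it only stores the terminal state of each excursion (enough to estimate $\mathbb{P}(T_n < R)$, but not the genealogical tree). All function names are hypothetical.

```python
import random


def splitting_estimate(p, n_levels, n_particles, rng):
    """Multilevel splitting estimate of P(walk from 0 reaches n_levels before -1)."""
    positions = [0] * n_particles          # all particles start at X_0 = 0
    estimate = 1.0
    for level in range(1, n_levels + 1):
        # Mutation: each particle runs until it enters B_level = [level, +inf)
        # or the hard obstacle C = (-inf, -1].
        reached = []
        for x in positions:
            while -1 < x < level:
                x += 1 if rng.random() < p else -1
            if x >= level:
                reached.append(x)
        if not reached:                    # all particles absorbed: cemetery state
            return 0.0
        # Proportion of excursions entering the current level.
        estimate *= len(reached) / n_particles
        # Selection: successful particles are kept; each killed particle is
        # replaced by a uniformly chosen successful one.
        n_killed = n_particles - len(reached)
        positions = reached + [rng.choice(reached) for _ in range(n_killed)]
    return estimate


def exact_probability(p, n_levels):
    """Closed form (q/p - 1) / ((q/p)^(n+1) - 1) for the same event, p != 1/2."""
    r = (1.0 - p) / p
    return (r - 1.0) / (r ** (n_levels + 1) - 1.0)


rng = random.Random(0)
print(splitting_estimate(0.4, 10, 2000, rng), exact_probability(0.4, 10))
```

With these parameters the target probability is below one percent, so a naive simulation of 2000 independent walks would see only a handful of successes, whereas every level of the splitting scheme is crossed by a sizeable fraction of the particles.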







To visualize these splitting particle models, we end this section with two ballistic recursion events. In both cases, the state $S = A \cup C$ is decomposed into two disjoint Borel sets $A$ and $C$. The target set $B$ is a subset of $A$. The decreasing sequence $(B_n)_{0\le n\le m}$ represents the physical levels the process $X$ needs to enter before reaching the desired target $B$.

Example 12.2.5 When $S = \mathbb{R}^d$ is the Euclidean space, we can think of a sequence of centered decreasing balls with radius $1/(n + 1)$

$$B_n = B(0, 1/(n + 1)) \subset \mathbb{R}^d \quad\text{and}\quad C = S - B(0, 1 + \epsilon)$$

for some $\epsilon \in (0, 1)$. Further assume that the process $X$ exits the ball of radius $(1 + \epsilon)$ in finite time. In this example, $\mathbb{P}(T < R)$ is the probability that $X$ hits the smallest ball

$$B_m = B(0, 1/(m + 2))$$

starting with $1/2 < |X_0| < 1$ and before exiting the ball of radius $(1 + \epsilon)$. The distribution (12.6) represents the conditional distribution of the process $X$ in this ballistic regime.



Example 12.2.6 When $S = \mathbb{R}_+$, we can choose for instance the intervals

$$B_n = [n + 1, \infty) \quad\text{and}\quad C = [0, 1 - \epsilon]$$

For instance, if we consider a birth and death process or a queueing network processing jobs, the level represents the population size or the number of jobs in the queue. In this case, $\mathbb{P}(T < R) = \mathbb{P}(X_T > m)$ can be interpreted as the probability that the population model or the queue length reaches some critical rare levels.

In Figure 12.4, we illustrate the genealogical particle model for a particle $X$ evolving in a set $A \subset S$ with recurrent subset $C = S - A$. To reach the desired target set $B_4$, the process needs to pass the sequence of levels

$$B_0 \supset B_1 \supset B_2 \supset B_3 \supset B_4$$



12.2.7 Fluctuation Results and Some Comparisons 

In this short section, we briefly explain why the splitting particle methodology often increases the numerical efficiency of Monte Carlo methods. By way of comparison, we consider the naive method based on $N$ independent copies $X_t^i$ of the Markov process $X_t$. We also let $T^i$ be the entrance time of $X^i$ into the set $B \cup C$. The corresponding unbiased estimator of $\mathbb{P}(T < R)$ is simply given by $N^{-1} \sum_{i=1}^{N} 1_B(X^i_{T^i})$. By the traditional central limit theorem for independent random variables (here with the target $B = B_n$, so that $T = T_n$), the random sequence

$$\sqrt{N}\ \Big(N^{-1} \sum_{i=1}^{N} 1_B(X^i_{T^i}) - \mathbb{P}(T_n < R)\Big)$$

converges in law (as $N \to \infty$) to a centered Gaussian random variable $\overline{W}_n$ such that

$$\mathbb{E}(\overline{W}_n^2) = \overline{\sigma}_n^2 =_{\mathrm{def.}} \mathbb{P}(T_n < R)\,(1 - \mathbb{P}(T_n < R))$$

FIGURE 12.4. Genealogical model, [ballistic regime, target B(4)] (N=4)

The study of the fluctuation of the splitting particle approximation models is discussed in Chapter 9. For instance, using Proposition 9.4.1 and Remark 9.4.1 we have the following theorem.

Theorem 12.2.2 For any $0 \le n \le m + 1$, the sequence of random variables

$$\sqrt{N}\ \big(\gamma_{n+1}^N(1)\, 1_{n<\tau^N} - \mathbb{P}(T_n < R)\big)$$

converges in law (as $N$ tends to $\infty$) to a Gaussian random variable $W_{n+1}$ with mean 0 and variance

$$\sigma_n^2 = \sum_{p=0}^{n+1} \gamma_p(1)^2\ \eta_{p-1}\Big(\mathcal{K}_{p,\eta_{p-1}}\big(Q_{p,n+1}(1) - \mathcal{K}_{p,\eta_{p-1}} Q_{p,n+1}(1)\big)^2\Big) \qquad (12.10)$$

The collection of functions $Q_{p,n+1}(1)$ on the excursion space $E = \cup_{s\le t}(\{t\} \times D([s,t], S))$ are defined for any $x = (x(u))_{s\le u\le t} \in D([s,t], S)$ and $s \le t$ by

$$Q_{p,n+1}(1)(t, x) = 1_{B_p}(x(t))\ \mathbb{P}(T_n < R \mid T_p = t,\ X_{T_p} = x(t))$$

Explicit calculations of $\sigma_n$ are in general difficult to obtain since they rely on an explicit knowledge of the semigroup $Q_{p,n}$. Next we provide an alternative formulation related to absorption times and rare event analysis. For any $p \le q \le n$, we set

$$\Delta_{p,q}^n(t, x) = \mathbb{P}(T_n < R \mid T_q = t,\ X_{T_q} = x)\,/\,\mathbb{P}(T_n < R \mid T_p < R)$$

After some elementary computations (see also page 305), we first prove that the variance $\sigma_n^2$ associated with the McKean transition $\mathcal{K}_{n,\eta}$ given in (12.8) takes the form

$$\sigma_n^2 = \mathbb{P}(T_n < R)^2\ (a_n - b_n)$$

with

$$a_n = \frac{1}{\gamma_{n+1}(1)^2} \sum_{p=0}^{n+1} \gamma_p(1)^2\ \eta_p\big((Q_{p,n+1}(1) - \eta_p Q_{p,n+1}(1))^2\big)$$

$$b_n = \frac{1}{\gamma_{n+1}(1)^2} \sum_{p=0}^{n+1} \gamma_p(1)^2\ \eta_{p-1}\big(\mathcal{G}_{p-1}\,(M_p Q_{p,n+1}(1) - \eta_p Q_{p,n+1}(1))^2\big)$$

Then we observe that $\gamma_p(1) = \mathbb{P}(T_{p-1} < R)$ and

$$\eta_p Q_{p,n+1}(1) = \gamma_{n+1}(1)/\gamma_p(1) = \mathbb{P}(T_n < R \mid T_{p-1} < R)$$

from which we conclude that

$$a_n = \sum_{p=0}^{n+1} \mathbb{E}\big([\Delta_{p-1,p}^n(T_p, X_{T_p})\, 1_{T_p<R} - 1]^2 \mid T_{p-1} < R\big)$$

In much the same way, we find

$$b_n = \sum_{p=0}^{n} \mathbb{E}\big(1_{T_p<R}\, [\Delta_{p,p}^n(T_p, X_{T_p}) - 1]^2 \mid T_{p-1} < R\big)$$

$$= \sum_{p=0}^{n} \mathbb{P}(T_p < R \mid T_{p-1} < R)\ \mathbb{E}\big([\Delta_{p,p}^n(T_p, X_{T_p}) - 1]^2 \mid T_p < R\big)$$

Loosely speaking, these two formulations indicate that the renormalized variance $\sigma_n^2/\mathbb{P}(T_n < R)^2$ is the sum of the "local variances" induced by the particle-splitting transition at each stage of the algorithm. When these local errors are uniformly bounded, then we have

$$\sigma_n^2 \le c\,(n + 1)\ \mathbb{P}(T_n < R)^2$$

for some finite constant $c < \infty$ whose values do not depend on the splitting parameter $n$. We illustrate this result for the case of the simple random walk on $S = \mathbb{Z}$ described in Example 12.2.3 (see also Exercise 12.2.9).







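To simplify the presentation, let us suppose that the chain $X_n$ starts at the origin $X_0 = 0$ and the thresholds are given by

$$C = (-\infty, 0) \quad\text{and}\quad B_n = [n, \infty) \quad n = 1, 2, \ldots$$

In this situation, the chain enters $C$ at the point $-1$, and for any $0 \le n_1 \le n_2$ we have from (12.9), applied with $a = -1$,

$$\mathbb{P}(T_{n_2} < R \mid T_{n_1} < R) = \mathbb{P}(X_{T_{n_2}} = n_2 \mid X_0 = n_1) = \frac{(q/p)^{n_1+1} - 1}{(q/p)^{n_2+1} - 1}$$

from which we prove that $b_n = 0$ and

$$a_n = 1 + \sum_{r=1}^{n} \Big\{\mathbb{P}(T_r > R \mid T_{r-1} < R) + \mathbb{P}(T_r < R \mid T_{r-1} < R)\, \big[\mathbb{P}(T_r < R \mid T_{r-1} < R)^{-1} - 1\big]^2\Big\}$$

$$= 1 + ((q/p) - 1) \sum_{r=1}^{n} \frac{1}{1 - (p/q)^r}$$

$$\le 1 + ((q/p) - 1)\,(n + 1)\, \frac{q}{q - p} = 1 + (n + 1)(q/p) \le (n + 1)/p$$

These calculations imply that

$$\sigma_n^2 \le \mathbb{P}(T_n < R)^2\, (n + 1)/p \quad\text{and}\quad \mathbb{P}(T_n < R) = \frac{(q/p) - 1}{(q/p)^{n+1} - 1}$$

and after some manipulations, recalling that $p < q$, one finds that for any $n \ge 1$ the ratio between the variance $\overline{\sigma}_n^2 = \mathbb{P}(T_n < R)(1 - \mathbb{P}(T_n < R))$ of the naive method and $\sigma_n^2$ satisfies

$$\overline{\sigma}_n^2/\sigma_n^2 \ \ge\ \frac{p}{n + 1}\ \frac{1 - \mathbb{P}(T_n < R)}{\mathbb{P}(T_n < R)} \ \ge\ p\,[(q/p)^n - 1]/(n + 1) \ \to\ \infty \quad\text{as } n \to \infty$$

Related comparisons between the limiting variances for interacting and non-interacting methods in the context of filtering methods can be found in [88].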
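The size of this gain is easy to tabulate. The sketch below is a hedged numerical illustration for the random walk example with $X_0 = 0$ and absorption at $-1$: it evaluates the closed form $\mathbb{P}(T_n < R) = ((q/p) - 1)/((q/p)^{n+1} - 1)$, the naive variance $\mathbb{P}(T_n<R)(1-\mathbb{P}(T_n<R))$, and the upper bound $\mathbb{P}(T_n<R)^2 (n+1)/p$ on the splitting variance. The function name and the returned tuple layout are assumptions made for the illustration.

```python
def ruin_comparison(p, n):
    """Return (P(T_n < R), naive variance, splitting variance bound, ratio lower bound)."""
    q = 1.0 - p
    r = q / p
    prob = (r - 1.0) / (r ** (n + 1) - 1.0)
    naive_var = prob * (1.0 - prob)
    splitting_bound = prob ** 2 * (n + 1) / p
    # Lower bound on (naive variance) / (splitting variance).
    ratio_lower = naive_var / splitting_bound
    return prob, naive_var, splitting_bound, ratio_lower


for n in (5, 10, 20, 30):
    prob, naive_var, bound, ratio = ruin_comparison(0.4, n)
    print(n, prob, ratio)
```

The printed ratio grows geometrically in $n$, which reflects the lower bound $p[(q/p)^n - 1]/(n + 1)$: the rarer the event, the larger the advantage of the splitting scheme over independent sampling.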



12.2.8 Exercises 

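Exercise 12.2.5: Let $W_t$ be a $d$-dimensional Wiener process with $d \ge 2$ and let $A_{\epsilon_1,\epsilon_2} = \{x \in \mathbb{R}^d : \epsilon_1 < |x| < \epsilon_2\}$ with $0 < \epsilon_1 < \epsilon_2$. We consider the partition $A_{\epsilon_1,\epsilon_2}^c = B_{\epsilon_1} \cup C_{\epsilon_2}$ with $B_{\epsilon_1} = \{|x| \le \epsilon_1\}$ and $C_{\epsilon_2} = \{|x| \ge \epsilon_2\}$.

• Using formula (12.5) and recalling that for any $x$ with $0 < \epsilon_1 < |x| < \epsilon_2$ we have

$$\mathbb{P}_x(W_t \text{ hits } 0 \text{ before } C_{\epsilon_2}) = \lim_{\epsilon_1 \to 0} \mathbb{P}_x(W_t \text{ hits } B_{\epsilon_1} \text{ before } C_{\epsilon_2})$$

check that $W_t$ never hits points.

• In much the same way, prove that for any $0 < \epsilon_1 < |x|$

$$\mathbb{P}_x(W_t \text{ hits } B_{\epsilon_1}) = \begin{cases} 1 & \text{for } d = 2 \\ (\epsilon_1/|x|)^{d-2} & \text{for } d \ge 3 \end{cases}$$

• Using the Markov property, show that for $d \ge 3$

$$\mathbb{P}_x(W_t \text{ hits } B_{\epsilon_1} \text{ after time } t) = \int \mathbb{P}_x(W_t \in dy)\ \mathbb{P}_y(W \text{ hits } B_{\epsilon_1}) \quad (\to 0 \text{ as } t \to \infty)$$

• Conclude that $W_t$ is recurrent in two dimensions and it wanders off to $\infty$ in dimension $d \ge 3$.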


Exercise 12.2.6: Let $S = A \cup A^c$ be a partition of the state space $S$. We consider a Markov chain $X_n$ starting at $A$ and we denote by $T$ the first time the chain exits the set $A$; that is, $T = \inf\{n \ge 0 : X_n \in A^c\}$. We further assume that there exists some $m \ge 1$ such that $\delta_m(A, A^c) =_{\mathrm{def.}} \inf_{x\in A} \mathbb{P}_x(X_m \in A^c) > 0$.

• Show that for any $x \in A$ we have $\mathbb{P}_x(T > m) \le (1 - \delta_m(A, A^c))$ and conclude that for any $n \ge 1$

$$\mathbb{P}_x(T > nm) = \mathbb{P}_x(T > nm \mid T > (n - 1)m)\ \mathbb{P}_x(T > (n - 1)m) \le (1 - \delta_m(A, A^c))^n$$

• Check that for any $n = pm + q$ with $p \ge 0$ and $0 \le q < m$ we have

$$\mathbb{P}_x(T > n) \le \mathbb{P}_x(T > pm) \le (1 - \delta_m(A, A^c))^{-1}\,(1 - \delta_m(A, A^c))^{n/m}$$

and conclude that for any $x \in A$ we have $\mathbb{P}_x(T < \infty) = 1$.



Exercise 12.2.7: Let $(\epsilon_n)_{n\ge 1}$ be a collection of independent and identically distributed random variables on the lattice $S = \mathbb{Z}^d$ with common law $\frac{1}{2d} \sum_{|e|=1} \delta_e$. Let $A \subset \mathbb{Z}^d$ be a finite set. We consider the simple random walk $X_n = X_{n-1} + \epsilon_n$ starting at some point $X_0 = x \in A$. Check that

$$m > \sup_{x\in A} |x| \implies \delta_m(A, A^c) = \inf_{x\in A} \mathbb{P}_x(X_m \in A^c) \ge (1/2d)^m$$

where $|x|$ stands for the minimal length of an admissible path joining 0 to $x$. Using Exercise 12.2.6, show that $\mathbb{P}_x(T < \infty) = 1$, where $T$ represents the entrance time of the chain into $A^c$.







Exercise 12.2.8: Let $(\epsilon_n)_{n\ge 1}$ be a collection of independent and identically distributed random variables with common law $\mu$ on $S = \mathbb{R}$. We further assume that $\mathbb{E}(\epsilon_1) = 0$ and $c = \mathbb{E}(\epsilon_1^2)^{1/2} > 0$ and we let $A = [a, b]$ be a finite interval. We consider the Markov chain $X_n = X_{n-1} + \epsilon_n$ starting at some point $X_0 = x \in A$.

• Show that for any $m \ge 1$ we have $\mathbb{E}((\sum_{p=1}^{m} \epsilon_p)^2) = m c^2$ and deduce that

$$\sqrt{m} > (|a| + b)/c \implies \mathbb{P}\Big(\Big|\sum_{p=1}^{m} \epsilon_p\Big| > |a| + b\Big) > 0$$

• Check that for any $\sqrt{m} > (|a| + b)/c$ we have $\inf_{x\in A} \mathbb{P}_x(X_m \in A^c) > 0$, and using Exercise 12.2.6 show that $\mathbb{P}_x(T < \infty) = 1$, where $T$ represents the entrance time of the chain into $A^c$. Prove that the same result holds true if we replace the nonnegative variance condition by the fact that $\mathbb{E}(|\epsilon_1|) < \infty$ and $\mathbb{P}(\epsilon_1 = 0) < 1$ (see for instance Proposition 7.2.3 in [278]).

• Prove that if $\mathbb{E}(\epsilon_1) < 0$, then for any $x \in \mathbb{R}$ we have $\mathbb{P}_x(R < \infty) = 1$, where $R$ denotes the entrance time of the set $C = (-\infty, a)$.



Exercise 12.2.9: [Gambler’s ruin] We consider the simple random walk 
model Xn starting at some x € [a, 6] and described in Example 12.2.3. We 
further assume that q > p. We introduce the function 

X G [a, oo) — > a(x) = Px(fi < oo) with R = inf {n >0 : X„ = a} 
as well as the first time the chain X„ reaches one of the boundaries 
T = inf {n > 0 : X„ G {o,fe}} (< R) 

• Check that if we have jx — y| > n or (j/ — x) ^ n -f 2k, for some k > 1 
then ¥x{^n = y) = 0. The case where (y - x) = k - {n- k), with 
0 < k < n, corresponds to situations where the chain has moved k 
steps to the right and (n - k) to the left. Prove that Px(X„ = y) = 

• Show that a is the minimal solution of the equation defined for any 
X > a by a(x) = pa{x + 1) + qa{x - 1) with the boimdary condition 
a(a) = 1. 

• Whenever p < g, we recall that the genered solution of the equation 
above has the form a(x) = A+B{qfpY with a(a) = 1 = A+B{q!pY 
so that a(x) = 1 + B{(g/p)* - (g/p)“}- Deduce from the above that 
Px(i2 < oo) = 1 for any x. 




456 



12. Applications 



• Check that for 6my n > 0 8uid A > 0 we have 

Px(fi > ») = P*(^n >a)< (pe^+^e"^)" 

• If we choose A = 5 log (q/p) € (0, 00), then prove that 

Px(fi>n)<(p/g)(*-“)/"(4pgr/2 



• Deduce from the above that for p ^ 1/2 

E.(T) < Ex(i?) = ^ ^ 

• Show that for any $a \le x \le b$ the stochastic process $M_n = (q/p)^{X_n}$
is a $P_x$-martingale with respect to the filtration $F_n = \sigma(X_0, \ldots, X_n)$
and if $p < q$, then $P_x$-a.s. on the event $\{T > n\}$ we have that

$$E_x(|M_{n+1} - M_n| \mid F_n) \le 2 (q/p)^b (q - p)$$

• Since we have $E_x(T) < \infty$ and $E_x(|M_{n+1} - M_n| \mid F_n) 1_{T > n} \le c$ for
some finite constant $c$, by a well-known martingale theorem of Doob
(see for instance Theorem 2 on p. 486 in [287]), prove that $E_x(M_T) =
E_x(M_0) = (q/p)^x$, and deduce that for any $x \in [a, b]$

$$(q/p)^x = (q/p)^b P_x(T < R) + (q/p)^a (1 - P_x(T < R))$$

Finally conclude that for any $p \ne q$ we have

$$P_x(T < R) = \frac{(q/p)^x - (q/p)^a}{(q/p)^b - (q/p)^a} \qquad (12.11)$$

• Using the strong Markov property, check that for any $p, q$ the function
$\beta(x) = P_x(T < R) = E_x(1_b(X_T))$ satisfies the equation

$$\beta(x) = p\,\beta(x + 1) + q\,\beta(x - 1)$$

for any $x \in (a, b)$, with the boundary conditions $(\beta(a), \beta(b)) = (0, 1)$.
For $p \ne q$, check that the function (12.11) is the unique solution, and
for $p = q = 1/2$ prove that the solution is given for any $x \in [a, b]$ by

$$P_x(T < R) = (x - a)/(b - a)$$
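
The last item of the exercise can be checked mechanically on small examples. The following Python sketch is only an illustration: the function names `exit_probability` and `satisfies_recursion` are hypothetical, and exact rational arithmetic is an assumption made here so that the recursion can be tested with equality rather than up to rounding.

```python
from fractions import Fraction

def exit_probability(x, a, b, p):
    """P_x(T < R): probability of reaching b before a, following (12.11)."""
    p = Fraction(p)
    q = 1 - p
    if p == q:                      # symmetric case p = q = 1/2
        return Fraction(x - a, b - a)
    r = q / p
    return (r**x - r**a) / (r**b - r**a)

def satisfies_recursion(a, b, p):
    """Check beta(x) = p beta(x+1) + q beta(x-1) on (a, b) and the boundary values."""
    p = Fraction(p)
    q = 1 - p
    beta = {x: exit_probability(x, a, b, p) for x in range(a, b + 1)}
    interior = all(beta[x] == p * beta[x + 1] + q * beta[x - 1]
                   for x in range(a + 1, b))
    return interior and beta[a] == 0 and beta[b] == 1

# With p = 1/3, q = 2/3 on [0, 3]: (2^1 - 2^0) / (2^3 - 2^0)
print(exit_probability(1, 0, 3, Fraction(1, 3)))   # prints 1/7
```

Rational arithmetic is slower than floating point but keeps the boundary conditions and the harmonic equation exact, which is convenient for a sanity check of this kind.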



Exercise 12.2.10: Let $W_t$ be a standard Wiener process on $\mathbb{R}$. A Brownian
motion $W_t - mt$ with negative drift (i.e., $m > 0$) can be defined as the




12.2 Random Excursion Models



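limit of the simple random walk defined (by a slight abuse of notation) by
$X_t = \sqrt{\Delta t} \sum_{n=1}^{[t/\Delta t]} \epsilon_n$, where $\epsilon_n$ is a sequence of properly scaled iid Bernoulli
random variables $P(\epsilon_n = 1) = 1 - P(\epsilon_n = -1) = p = \frac{1}{2}(1 - m\sqrt{\Delta t})$. If we
let $\Delta t \to 0$, then check that $E(X_t) = -m\,\Delta t\,[t/\Delta t] \to -mt$ and

$$E((X_t - E(X_t))^2) = \Delta t\,[t/\Delta t]\,(1 - m^2 \Delta t) \to t$$

Using Exercise 12.2.9 and the fact that

$$((1 - p)/p)^{1/\sqrt{\Delta t}} = (1 + m\sqrt{\Delta t})^{1/\sqrt{\Delta t}} (1 - m\sqrt{\Delta t})^{-1/\sqrt{\Delta t}}$$

check that for any $x \in [a, b]$ we have $P_x(T < R) = \frac{e^{2mx} - e^{2ma}}{e^{2mb} - e^{2ma}}$, where $T$ is
the first time the process hits one of the boundaries $\{a, b\}$ and $R$ the first
time it hits $a$.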
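
The passage to the limit can be explored numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the walk is assumed to live on the grid $\sqrt{\Delta t}\,\mathbb{Z}$ so that a position $x$ corresponds to the exponent $x/\sqrt{\Delta t}$ in (12.11), and non-integer exponents are tolerated for simplicity.

```python
import math

def walk_exit_probability(x, a, b, m, dt):
    """Formula (12.11) for the rescaled walk with p = (1 - m sqrt(dt)) / 2."""
    s = math.sqrt(dt)
    p = 0.5 * (1.0 - m * s)
    r = (1.0 - p) / p                       # q / p for the rescaled walk
    return (r**(x / s) - r**(a / s)) / (r**(b / s) - r**(a / s))

def brownian_exit_probability(x, a, b, m):
    """Candidate limit for the Brownian motion W_t - m t."""
    num = math.exp(2 * m * x) - math.exp(2 * m * a)
    den = math.exp(2 * m * b) - math.exp(2 * m * a)
    return num / den

for dt in (1e-2, 1e-4, 1e-6):
    print(dt, walk_exit_probability(0.5, 0.0, 1.0, 1.0, dt))
print("limit", brownian_exit_probability(0.5, 0.0, 1.0, 1.0))
```

As the time step shrinks, the discrete values should approach the exponential expression, since $((1-p)/p)^{1/\sqrt{\Delta t}} \to e^{2m}$.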



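Exercise 12.2.11: [Birth and death model] We consider the simple random
walk model $X_n$ on $E = \mathbb{N}$ described in Example 12.2.4.

• In the homogeneous situation $(p(x), q(x)) = (p, q)$ from Exercise 12.2.9,
check that for any $x \in \mathbb{N}$ we have $\alpha(x) = P_x(R < \infty) = 1$ with the
time of absorption $R = \inf\{n \ge 0 : X_n = 0\}$.

• More generally, prove that the function $\alpha$ is the minimal solution of
the equation $\alpha(x) = p(x)\alpha(x + 1) + q(x)\alpha(x - 1)$ with the boundary
condition $\alpha(0) = 1$.

• Using the fact that

$$p(x)\{\alpha(x + 1) - \alpha(x)\} + q(x)\{\alpha(x - 1) - \alpha(x)\} = 0$$

if we set $\Delta\alpha(x) = \alpha(x) - \alpha(x - 1)$, then prove that

$$\Delta\alpha(x + 1) = \frac{q(x)}{p(x)}\,\Delta\alpha(x) = \Big\{\prod_{z=1}^{x} \frac{q(z)}{p(z)}\Big\}\,\Delta\alpha(1)$$

• Deduce from the above that

$$\alpha(x) = 1 + \sum_{y=1}^{x} \Delta\alpha(y) = 1 - \sum_{y=0}^{x-1} \Big\{\prod_{z=1}^{y} \frac{q(z)}{p(z)}\Big\}\,(1 - \alpha(1))$$

• Two situations may occur. First check that if $\sum_{y \ge 0} \{\prod_{z=1}^{y} q(z)/p(z)\} = \infty$,
then we have $\alpha(x) = 1$ for any $x \in \mathbb{N}$. In the opposite situation, we
can choose any $\alpha(1)$ such that $0 \le (1 - \alpha(1)) \sum_{y \ge 0} \{\prod_{z=1}^{y} q(z)/p(z)\} \le 1$.
If we take $\alpha(1) = 1 - [\sum_{y \ge 0} \{\prod_{z=1}^{y} q(z)/p(z)\}]^{-1}$, then the
minimal solution $\alpha(x)$ is given by

$$\alpha(x) = \sum_{y \ge x} \prod_{z=1}^{y} \frac{q(z)}{p(z)} \Big/ \sum_{y \ge 0} \prod_{z=1}^{y} \frac{q(z)}{p(z)}$$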
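
In the summable case, the ratio of series for the minimal solution can be approximated by truncating both sums. The sketch below is a hypothetical illustration: the function name, the callables `p` and `q`, and the truncation level `y_max` are assumptions made here, and the truncation is only meaningful when the series converges.

```python
def absorption_probability(x, p, q, y_max=500):
    """Truncated ratio sum_{y>=x} w(y) / sum_{y>=0} w(y), w(y) = prod_{z=1}^{y} q(z)/p(z)."""
    weights = [1.0]                         # empty product for y = 0
    for z in range(1, y_max + 1):
        weights.append(weights[-1] * q(z) / p(z))
    return sum(weights[x:]) / sum(weights)

# Homogeneous walk drifting to the right: p = 2/3, q = 1/3, so q/p = 1/2
alpha_3 = absorption_probability(3, lambda z: 2 / 3, lambda z: 1 / 3)
print(alpha_3)   # close to (1/2)^3
```

For constant rates with $q < p$ the formula reduces to $\alpha(x) = (q/p)^x$, which gives a quick consistency check.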



Exercise 12.2.12: Let $X'_n$, $(B, C)$ and $B_n$ be a Markov chain, the pair of
target/obstacle sets and the multilevel decomposition described in the
introduction and in Section 12.2.3. We fix a finite time horizon $h$ and we
set



y;, = (n,x;)€K = ({"}x®') 

B„ = {|0,/i)kB„) and C = {{)i)xB') 

Note that the final time horizon $h$ is regarded as a hard obstacle set that the
process tries to avoid during its excursions between the level sets $B_n$. We
let $G_n$ be the potential functions on

= with =<u/, x...xf?;) 

defined for any p < q and j/ = (t,Xt)p<t<, € by 

G„(y) = 1b„(9,x,) = l(o,h)(g) 1 b„(x,) 

We consider the excursion-valued Markov chain 

r„ = [Yl ; Tn-i <t<Tn)eE 

where $T_n = T_{B_n \cup C}$ is the entrance time of $X'$ into the set $B_n \cup C$. Check
that

n 

EuMYo,...,Yn) n^pW) 

P=1 



~ 1 0 ^ ^ ^ ^nl) 1x1 eB„, T„<h) 



with 



[(t,X'); 0<t<Tn] 

= ((0, XI,), {{t, X') ; 0 < t < Tj), . . . , {{t, XI) ; Tn-i < t < T„)) 







12.3 Change of Reference Measures 

12.3.1 Introduction 

In Section 2.4.2, we have seen that a given distribution on path space
may have different Feynman-Kac interpretations. The objective of this
section is to extend these ideas to excursion models and better connect these
interpretations with importance sampling methods. In Section 12.3.3, we
illustrate these changes of measure in the sequential analysis of probability
ratio tests. In Section 12.3.4, we apply the multisplitting particle
methodology to estimate the accuracy of these statistical tests. To formalize these
changes of measure, suppose we are given a collection of Markov
transitions $\bar M_n$ from $E_{n-1}$ into $E_n$ such that for any $x_{n-1} \in E_{n-1}$ we have
$M_n(x_{n-1}, \cdot) \ll \bar M_n(x_{n-1}, \cdot)$. Also suppose that $\bar\eta_0$ is a given
distribution on $E_0$ such that $\eta_0 \ll \bar\eta_0$ and the corresponding Radon-Nikodym
derivatives are bounded positive functions. If we set

$$\bar P_n(d(x_0, \ldots, x_n)) = \bar\eta_0(dx_0)\, \bar M_1(x_0, dx_1) \cdots \bar M_n(x_{n-1}, dx_n)$$



then we find that $P_n \ll \bar P_n$ and

$$\frac{dP_n}{d\bar P_n}(x_0, \ldots, x_n) = \prod_{p=0}^{n} \frac{dM_p(x_{p-1}, \cdot)}{d\bar M_p(x_{p-1}, \cdot)}(x_p)$$

with the conventions $M_0(x_{-1}, dx_0) = \eta_0(dx_0)$ and $\bar M_0(x_{-1}, dx_0) = \bar\eta_0(dx_0)$
for $p = 0$. By Ionescu-Tulcea's extension theorem (see for instance
Theorem 2 on p. 249 in [287]), the distributions $(P_n, \bar P_n)$ can be extended into
a unique pair of distributions $(P, \bar P)$ on the canonical space of infinite
sequences $(\Omega = \prod_{n \ge 0} E_n, \mathcal F = (\mathcal F_n)_{n \ge 0}, (X_n)_{n \ge 0})$. If we denote by $E(\cdot)$ and
$\bar E(\cdot)$ the corresponding expectation operators, then we find that



$$E(f_n(X_{[0,n]})) = \bar E\Big( f_n(X_{[0,n]}) \prod_{p=0}^{n} \frac{dM_p(X_{p-1}, \cdot)}{d\bar M_p(X_{p-1}, \cdot)}(X_p) \Big)$$

for any bounded measurable function $f_n$ on $E_{[0,n]} = \prod_{p=0}^{n} E_p$, with
$X_{[0,n]} =_{def.} (X_p)_{0 \le p \le n}$ for every $n \ge 0$. Note that the formulae displayed
above can be interpreted as a Feynman-Kac path measure with potential
functions given by the Radon-Nikodym derivatives. This elementary
observation shows that one can generate path samples of a Markov process
with transitions $M_n$, using a genealogical tree model associated with $\bar M_n$-exploration
particles with updating mechanisms dictated by the Radon-Nikodym
potential functions. More generally, we have the following
proposition.
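
On a finite state space the change of reference measure can be verified by brute-force enumeration of paths. The following sketch is an illustration only: the function names, the time-homogeneous transition matrices and the two-state example are assumptions made here, and the Radon-Nikodym derivatives reduce to ratios of matrix entries.

```python
from itertools import product

def path_expectation(f, eta0, M, n):
    """E(f(X_0, ..., X_n)) by exact enumeration of all paths of a finite chain."""
    states = range(len(eta0))
    total = 0.0
    for path in product(states, repeat=n + 1):
        prob = eta0[path[0]]
        for u, v in zip(path, path[1:]):
            prob *= M[u][v]
        total += prob * f(path)
    return total

def reweighted_expectation(f, eta0, M, eta0_bar, M_bar, n):
    """Same quantity computed under the reference chain, with likelihood-ratio weights."""
    def weighted(path):
        w = eta0[path[0]] / eta0_bar[path[0]]        # d eta_0 / d eta_0_bar
        for u, v in zip(path, path[1:]):
            w *= M[u][v] / M_bar[u][v]               # d M(u, .) / d M_bar(u, .) at v
        return w * f(path)
    return path_expectation(weighted, eta0_bar, M_bar, n)

eta0, M = [0.9, 0.1], [[0.8, 0.2], [0.3, 0.7]]
eta0_bar, M_bar = [0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]]
f = lambda path: 1.0 if path[-1] == 1 else 0.0
print(path_expectation(f, eta0, M, 4), reweighted_expectation(f, eta0, M, eta0_bar, M_bar, 4))
```

Both numbers should agree up to rounding, which is the finite-state content of the identity between $E$ and $\bar E$.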







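Proposition 12.3.1 Let $T$ be a stopping time. Then, for any bounded
measurable function $f$ on the excursion space $\cup_{n \ge 0}(\{n\} \times E_{[0,n]})$, we have

$$E(f_T(X_{[0,T]})\, 1_{T < \infty}) = \bar E\Big( f_T(X_{[0,T]}) \prod_{p=0}^{T} \frac{dM_p(X_{p-1}, \cdot)}{d\bar M_p(X_{p-1}, \cdot)}(X_p)\, 1_{T < \infty} \Big)$$

Proof:

Since $P(T < \infty) = 1$, the proof is an immediate consequence of the
following equalities

$$E(f_T(X_{[0,T]})\, 1_{T < \infty}) = \sum_{n \ge 0} E(f_n(X_{[0,n]})\, 1_{T = n})$$

$$= \sum_{n \ge 0} \bar E\Big( f_n(X_{[0,n]})\, 1_{T = n} \prod_{p=0}^{n} \frac{dM_p(X_{p-1}, \cdot)}{d\bar M_p(X_{p-1}, \cdot)}(X_p) \Big)$$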



By the definition of $\bar P_n$, we notice that the Feynman-Kac path measure
introduced in (11.7) can alternatively be rewritten as

$$Q_n(d(x_0, \ldots, x_n)) = \Big\{ \prod_{q=0}^{n} \bar G_q(x_{q-1}, x_q) \Big\}\, \bar P_n(d(x_0, \ldots, x_n)) \qquad (12.12)$$

with $\bar G_0(x_{-1}, x_0) = G_0(x_0) \frac{d\eta_0}{d\bar\eta_0}(x_0)$ and for any $q \ge 1$

$$\bar G_q(x_{q-1}, x_q) = G_q(x_q)\, \frac{dM_q(x_{q-1}, \cdot)}{d\bar M_q(x_{q-1}, \cdot)}(x_q)$$

This observation, together with Proposition 12.3.1, yields the following
Feynman-Kac model in excursion space.

Proposition 12.3.2 Let $G_n$ be a sequence of $[0, 1]$-valued potential
functions on some measurable spaces $(E_n, \mathcal E_n)$. For any stopping time $T$ and
any bounded measurable function $f$ on $\cup_{n \ge 0}(\{n\} \times E_{[0,n]})$, we have

$$E\Big(f_T(X_{[0,T]}) \prod_{p=0}^{T} G_p(X_p)\, 1_{T < \infty}\Big) = \bar E\Big(f_T(X_{[0,T]}) \prod_{p=0}^{T} \bar G_p(X_{p-1}, X_p)\, 1_{T < \infty}\Big)$$



12.3.2 Importance Sampling 

The changes of reference measure described in Section 12.3.1 provide a
new tuning parameter in the numerical solution of a particular Feynman-Kac
model. These modeling strategies also offer significant benefits. For
instance, let us suppose that the functions $G_n$ are not bounded and/or
not strictly positive but the new potentials $\widehat G_n = M_{n+1}(G_n)$ satisfy these
properties. In this case, it is more judicious to use the particle model
associated with a change of reference measure (see Section 2.4.3). Now suppose
that a direct simulation of the chain $X_n$ is inefficient. This happens for
instance when a precise simulation technique for it is not known, when the
oscillations of the potential functions are too large or when the chain is too
slow to achieve a given rare event. One strategy to solve this problem is to
use a different exploration distribution (see for instance Exercise 12.3.8).
This can be performed using the change of reference probability measures
presented earlier. For instance, the change of measure presented in (11.17)
corresponds to the particular situation where $\bar M_n$ are chosen to be the
transitions $\widehat M_n$ defined in (11.16). In this case, the particle mutations are
related to the potential function, and the resulting particle model is more refined
in regions with high potential.

From the statistical point of view, formula (12.12) can also be interpreted
as an importance sampling plan. Importance sampling is a classical
technique commonly used in statistics to speed up simulations and increase
the efficiency of a given Monte Carlo method. The common idea here is
to modify the Feynman-Kac representation formula by replacing the
reference Markov chain distribution with a new one in order to facilitate the
simulations and/or to increase the accuracy of the adaptive stochastic grid.
In the importance sampling literature, the new distribution is also called the
“twisted distribution”, and the corresponding change of measure
derivative is referred to as the “likelihood ratio” or “the weighted function”. In
this context, the choice of the pair $(\bar G_n, \bar M_n)$ is often called the sampling
plan, and the strategy is to choose a judicious “twisted distribution” to
increase the probability of occurrence of the desired events. This statistical
interpretation leads immediately to the question of the optimal change of
probability measure. The answer depends on the choice of some criteria
with respect to all possible changes of reference measures. If we use the
variance criterion associated with the central limit theorem developed in
Chapter 9, we are led to solve a nonhomogeneous variational problem in
distribution space. To our knowledge, no satisfactory general criteria and
solutions have been proposed in the literature on this subject. Rather, we
believe that the choice of the “twisted distribution” strongly depends on
the problem at hand and on the events of interest. Loosely speaking, the
counterpart of increasing the probability of occurrence of some events is to
decrease the chances of the opposite. For more details, we refer the reader
to the simple but illustrative Exercise 11.9.6 as well as the worked-out
examples provided at the end of this section.







12.3.3 Sequential Analysis of Probability Ratio Tests 

Suppose we are given a pair of Markov transitions $(M_n, \bar M_n)$ from some
measurable space $(E, \mathcal E)$ into itself. Also let $P_y$ and $\bar P_y$ be the distributions
of a Markov chain with elementary transitions $M_n$ and $\bar M_n$ and starting
at $y \in E$. We suppose that these distributions are defined on a common
canonical space $(\Omega, \mathcal F = (\mathcal F_n)_{n \ge 0}, (Y_n)_{n \ge 0})$ and we denote by $P_{y,n}$ and
$\bar P_{y,n}$ their restrictions to $\mathcal F_n$. We further assume that for any $y \in E$
the probability measures $M_n(y, \cdot)$ and $\bar M_n(y, \cdot)$ are absolutely continuous
with each other. Under this condition the probability measures $P_y$ and
$\bar P_y$ are locally absolutely continuous in the sense that for any $n \ge 0$ and
$(y_p)_{0 \le p \le n} \in E^{n+1}$ (with $y_0 = y$) we have

$$\frac{d\bar P_{y,n}}{dP_{y,n}}(y_0, \ldots, y_n) = \prod_{p=1}^{n} \frac{d\bar M_p(y_{p-1}, \cdot)}{dM_p(y_{p-1}, \cdot)}(y_p)$$



One important question is to check whether or not the distributions $P_y$
and $\bar P_y$ are absolutely continuous. This problem arises in various situations
such as in sequential analysis and particularly in the study of statistical
hypothesis probability ratio tests. We refer the interested reader to any
classical textbook on statistics (see for instance the monograph of D.
Siegmund [289]). In this context, the answer to the question above is in general
negative and $P_y$ and $\bar P_y$ are orthogonal (or singular) in the sense that there
exists some $A \in \mathcal F_\infty = \sigma(\cup_{n \ge 0} \mathcal F_n)$ such that $P_y(A) = \bar P_y(A^c) = 1$. In the
further development of this section, we shall work under this assumption.
We refer the reader to the set of exercises provided at the end of this
section. Under this condition, it is readily checked that under $P_y$ the stochastic
sequence

$$Z_n(Y) = \frac{d\bar P_{y,n}}{dP_{y,n}}(Y_0, \ldots, Y_n)$$



is a martingale. Since $E_y(Z_n(Y)) = 1$, the limit $\lim_{n \to \infty} Z_n(Y) = Z_\infty(Y)$
exists $P_y$-a.e. and we have

$$\bar P_y(\lim_{n \to \infty} Z_n(Y) = \infty) = 1 = P_y(\lim_{n \to \infty} Z_n(Y) = 0)$$

To prove the first equality assertion, we combine the fact that $P_y$ and $\bar P_y$
are orthogonal with the formula

$$\bar P_y(A) = \int_A Z_\infty(Y)\, dP_y + \bar P_y(A \cap (Z_\infty(Y) = \infty))$$

which is valid for any $A \in \mathcal F_\infty$ (see for instance Theorem 1, p. 525 in [287]).
To prove the second equality in the display above, we use the fact that
$P_{y,n} \ll \bar P_{y,n}$ and by symmetry arguments $P_y(\lim_{n \to \infty} \bar Z_n(Y) = \infty) = 1$
with the $\bar P_y$-martingale $\bar Z_n(Y) = Z_n^{-1}(Y)$.







The simple hypothesis testing problem is defined as follows. Suppose we
have a sample from a Markov chain $Y = (Y_n)_{n \ge 0}$ (starting at $Y_0 = y$
at time $n = 0$) and we want to know whether $Y$ is a random sample
from $\bar P_y$ (hypothesis $(\bar H)$) or a sample from $P_y$ (hypothesis $(H)$). From
previous considerations, if one of the hypotheses $(\bar H)$ or $(H)$ is true, then
we respectively have that

$$\lim_{n \to \infty} Z_n(Y) = \infty \quad \text{or} \quad \lim_{n \to \infty} Z_n(Y) = 0$$

Thus, one natural way to test one of these two hypotheses is to choose a
pair of parameters $0 < a < 1 < b$ and wait until the first time $T$ the random
sequence $Z_n(Y)$ exits the domain $(a, b)$, namely

$$T = \inf\{n \ge 0 : Z_n(Y) \notin (a, b)\} \qquad (12.13)$$

Then we accept or reject $(\bar H)$ if we have respectively $Z_T(Y) \ge b$ or $Z_T(Y) \le a$.
The region $(a, b)$ is called the critical region. Note that in both situations
$T$ is an almost surely finite stopping time so that the algorithm above always terminates
in finite time. To get one step further, we introduce the random time

$$R = \inf\{n \ge 0 : Z_n(Y) \le a\} \quad (\ge T)$$

One rather simple way to measure the accuracy of the test is to estimate
the probability of rejecting $(H)$ if $(H)$ is true. These probabilities are the
so-called type I errors (or size of the test). They are defined formally by
the formula

$$P_y(Z_T(Y) \ge b) = P_y(T < R) = \bar E_y(Z_T^{-1}(Y)\, 1_{Z_T(Y) \ge b}) \quad (\le b^{-1}\, \bar P_y(Z_T(Y) \ge b)) \qquad (12.14)$$

Type II errors are defined as the probability of accepting $(H)$ when it is
false or equivalently of rejecting $(\bar H)$ when it is true:

$$\bar P_y(Z_T(Y) \le a) = E_y(Z_T(Y)\, 1_{Z_T(Y) \le a}) \quad (\le a\, P_y(Z_T(Y) \le a)) \qquad (12.15)$$

We mention that the probability $\bar P_y(Z_T(Y) > a)$ of rejecting $(H)$ when it
is false is called the power of the test. Also note that in view of (12.14) it
is tempting to take $b$ very large, but recalling (12.13), in doing so we
increase the probability (12.15). Thus we can only compare two tests with
the same power (or with the same size).
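
The stopping rule (12.13) and the bound in (12.14) can be illustrated on a toy model. The sketch below is not taken from the text: it assumes iid Bernoulli observations with success probability `theta` under $P_y$ and `theta_bar` under $\bar P_y$ (a degenerate Markov chain), and the function names, the seed and the number of runs are hypothetical choices.

```python
import random

def sprt(sample, ratio, a, b):
    """Run the test until Z_n leaves (a, b) and return the terminal value Z_T."""
    z = 1.0
    while a < z < b:
        z *= ratio(sample())
    return z

def type_one_error(theta, theta_bar, a, b, n_runs=20_000, seed=0):
    """Crude Monte Carlo estimate of P_y(Z_T >= b), sampling under P_y."""
    rng = random.Random(seed)
    sample = lambda: 1 if rng.random() < theta else 0
    # One-step likelihood ratio d M_bar / d M for a Bernoulli observation
    ratio = lambda y: theta_bar / theta if y == 1 else (1 - theta_bar) / (1 - theta)
    hits = sum(sprt(sample, ratio, a, b) >= b for _ in range(n_runs))
    return hits / n_runs

estimate = type_one_error(theta=0.4, theta_bar=0.6, a=0.2, b=5.0)
print(estimate, "<=", 1 / 5.0)
```

In this example $Z_n = 1.5^{S_n}$ for a simple random walk $S_n$, so the exact type I error is a gambler's ruin probability of the form (12.11), and the estimate should stay below the bound $b^{-1}$.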

12.3.4 A Multisplitting Particle Approach

Note that the expression (12.14) is a decreasing function of $b$ and it has
the same form as the one introduced in (12.6). As we already mentioned,
these probabilities can rarely be solved explicitly. They rather belong to
the class of rare event estimation problems discussed in Section 12.2.5 and
can be estimated using the splitting methodology. In the present
situation, the target and obstacle sets are respectively given by $B = [b, \infty)$ and
$C = (-\infty, a]$. The underlying Markov process is the multiplicative chain
$(X_n, Y_n) = (Z_n(Y), Y_n)$ defined by the recursion

$$X_n = X_{n-1} \times \frac{d\bar M_n(Y_{n-1}, \cdot)}{dM_n(Y_{n-1}, \cdot)}(Y_n) \in E = \mathbb{R}_+ \quad \text{with} \quad X_0 = 1 \qquad (12.16)$$

where $Y_n$ is a Markov chain with elementary transitions $M_n$ and starting at
$Y_0 = y$. We will of course not repeat the integral description of the
particle algorithm associated with this model. To give a brief illustration,
let us assume for simplicity that $b$ is an integer. In this case we can choose
to split the particles whenever they hit the intervals $B_n = [n + 1, \infty)$,
$1 \le n < b$. To sample the first mutation step, we sample $N$ independent
excursions $(Y_p^i)_{0 \le p \le T_1^i}$ with elementary transitions $M_p$ (starting at $y$) up
to the first time they hit $(B_1 \cup C)$

$$T_1^i = \inf\Big\{ p \ge 0 : \prod_{q=1}^{p} \frac{d\bar M_q(Y_{q-1}^i, \cdot)}{dM_q(Y_{q-1}^i, \cdot)}(Y_q^i) \in (B_1 \cup C) \Big\}$$



Each excursion ending in $C$ is killed, and instantly a particle randomly
chosen among those that reach $B_1$ splits into two offsprings. After this
updating stage, we evolve independently the particles in $B_1$ up to the first
time the corresponding probability ratio hits $(B_2 \cup C)$. Then, we update this
new configuration as before. We kill the excursions ending in $C$ and
duplicate the ones in $B_2$, and so on. Formula (12.14) also suggests two additional
particle interpretations. If we define the potential function $G(x, y) = x/y$
on $(0, \infty)^2$, then by the definition of $X_n$ we obtain the equations

$$P_y(T < R) = \bar E_y(Z_T^{-1}(Y)\, 1_{Z_T(Y) \ge b}) = \bar E_1\Big( \prod_{p=1}^{T} G(X_{p-1}, X_p)\, 1_{T < R} \Big) \qquad (12.17)$$

where $\bar E_1(\cdot)$ represents the expectation operator with respect to the law
of the Markov chain defined in (12.16) under $\bar P_y$. The second expression
can be approximated using the excursion particle models developed in
Section 12.2. Since under $\bar P_y$ the Markov chain $X_n$ tends to infinity, the
exploration excursions are more likely to reach the upper levels. Nevertheless,
the updating mechanism will favor excursions that have not entered into
these levels too fast. The first expression in (12.17) is related to the more
traditional importance sampling Monte Carlo method. The corresponding
estimator is defined by $\frac{1}{N} \sum_{i=1}^{N} Z_{T^i}^{-1}(Y^i)\, 1_{Z_{T^i}(Y^i) \ge b}$, where $(Y^i)_{1 \le i \le N}$ is a
sequence of $N$ independent copies of the chain $Y$ under $\bar P_y$. The variance
of this estimate is proportional to

$$\bar E_y(Z_T^{-2}(Y)\, 1_{Z_T(Y) \ge b}) - P_y(T < R)^2 \le b^{-1} P_y(T < R) - P_y(T < R)^2$$







This clearly shows the improvements obtained in comparison with the
rather crude Monte Carlo estimate $\frac{1}{N} \sum_{i=1}^{N} 1_{Z_{T^i}(Y^i) \ge b}$ based on $N$
independent copies of the chain $Y$ under $P_y$.
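
The importance sampling estimator attached to the first expression in (12.17) can be sketched on the same kind of toy model. Everything specific below is an assumption made for illustration: iid Bernoulli observations with parameter `theta` under $P_y$ and `theta_bar` under $\bar P_y$, and hypothetical names, seed and sample size.

```python
import random

def importance_sampling_estimate(theta, theta_bar, a, b, n_runs=20_000, seed=1):
    """Estimate P_y(Z_T >= b) by sampling under P_bar_y and weighting by 1 / Z_T."""
    rng = random.Random(seed)
    up = theta_bar / theta                    # likelihood ratio for an observation equal to 1
    down = (1 - theta_bar) / (1 - theta)      # likelihood ratio for an observation equal to 0
    total = 0.0
    for _ in range(n_runs):
        z = 1.0
        while a < z < b:                      # stopping rule (12.13)
            z *= up if rng.random() < theta_bar else down
        if z >= b:
            total += 1.0 / z                  # weight Z_T^{-1} on the event {Z_T >= b}
    return total / n_runs

print(importance_sampling_estimate(theta=0.4, theta_bar=0.6, a=0.2, b=5.0))
```

Each nonzero term of the sum is at most $b^{-1}$, which is the mechanism behind the variance bound stated above.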

12.3.5 Exercises 

Exercise 12.3.3: [Importance sampling] Suppose we want to evaluate the
integral $\mu(G)$ of a nonnegative and bounded potential function $G$ with
respect to some distribution $\mu$ on some measurable space $(E, \mathcal E)$. We associate
with a sequence of independent random variables $(X^i)_{i \ge 1}$ with common
distribution $\mu$ the empirical measures $\mu^N = \frac{1}{N} \sum_{i=1}^{N} \delta_{X^i}$.

• Check that $E(\mu^N(G)) = \mu(G)$ and

$$N\, E((\mu^N(G) - \mu(G))^2) = \sigma_\mu^2(G) =_{def.} \mu((G - \mu(G))^2)$$

• For any probability measure $\bar\mu$ such that $\mu \ll \bar\mu$, prove that $\mu(G) =
\bar\mu(\bar G)$ with $\bar G = G\, \frac{d\mu}{d\bar\mu}$. We let $\bar\mu^N = \frac{1}{N} \sum_{i=1}^{N} \delta_{\bar X^i}$ be the occupation
measure associated with a sequence of $N$ independent random
variables $(\bar X^i)_{i \ge 1}$ with common distribution $\bar\mu$. Prove that $E(\bar\mu^N(\bar G)) =
\mu(G)$ and

$$N\, E((\bar\mu^N(\bar G) - \mu(G))^2) = \sigma_{\bar\mu}^2(\bar G) =_{def.} \bar\mu((\bar G - \mu(G))^2) = \sigma_\mu^2(G) - \mu\Big(G^2 \Big(1 - \frac{d\mu}{d\bar\mu}\Big)\Big)$$

• Roughly speaking, from the equation above, we see that a reduction
of variance is obtained as soon as $\bar\mu$ is chosen such that $\frac{d\mu}{d\bar\mu} < 1$ on
regions where $G$ is more likely to take large values. In other words, it is
judicious to choose a new reference distribution $\bar\mu$ so that the sampled
particles $\bar X^i$ are more likely to visit regions with high potential. For
instance, if $G = 1_A$ is the indicator function of some measurable set
$A \in \mathcal E$, then prove that

$$\sigma_{\bar\mu}^2(\bar G) = \sigma_\mu^2(G) - \mu\Big(1_A \Big(1 - \frac{d\mu}{d\bar\mu}\Big)\Big)$$

If we choose $\bar\mu$ such that $\frac{d\mu}{d\bar\mu}(x) \le 1 - \delta$ for any $x \in A$, then check
that

$$\bar\mu(A) \ge \mu(A)/(1 - \delta) \quad \text{and} \quad \sigma_{\bar\mu}^2(\bar G) + \delta\, \mu(A) \le \sigma_\mu^2(G)$$

• Show that the optimal distribution $\bar\mu$ is the Boltzmann-Gibbs measure
$\bar\mu(dx) = \Psi(\mu)(dx) = \mu(G)^{-1} G(x)\, \mu(dx)$ in the sense that $\sigma_{\bar\mu}^2(\bar G) = 0$.
This optimal strategy is clearly hopeless since the normalizing
constant $\mu(G)$ is precisely the constant we want to estimate!

• As a bad choice example, if we take $\bar\mu(dx) = \mu(G^{-1})^{-1} G^{-1}(x)\, \mu(dx)$, then
check that $\sigma_{\bar\mu}^2(\bar G) \ge \mu(G^3)/\mu(G) - \mu(G)^2 \ge \sigma_\mu^2(G)$.
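
On a finite state space the variances compared in this exercise can be computed exactly. The sketch below is an illustration with hypothetical names and a made-up three-point example; measures are plain lists of weights and $\frac{d\mu}{d\bar\mu}$ is the ratio of the corresponding entries.

```python
def is_variance(mu, mu_bar, G):
    """sigma^2_{mu_bar}(G_bar) = mu_bar(G_bar^2) - mu(G)^2 with G_bar = G dmu/dmu_bar."""
    mean = sum(m * g for m, g in zip(mu, G))
    second = sum(mb * (g * m / mb) ** 2
                 for m, mb, g in zip(mu, mu_bar, G) if mb > 0)
    return second - mean ** 2

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

mu = [0.7, 0.2, 0.1]
G = [0.1, 0.5, 1.0]
crude = is_variance(mu, mu, G)                                   # no change of measure
optimal = is_variance(mu, normalize([m * g for m, g in zip(mu, G)]), G)
bad = is_variance(mu, normalize([m / g for m, g in zip(mu, G)]), G)
print(crude, optimal, bad)
```

The Boltzmann-Gibbs choice should give a variance equal to zero up to rounding, while the choice proportional to $G^{-1}\mu$ should be worse than sampling from $\mu$ itself.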



Exercise 12.3.4: [Simple random walk] Let $(\epsilon_n)_{n \ge 0}$ be independent and
identically distributed random variables with common law $P(\epsilon_n = 1) = 1 -
P(\epsilon_n = -1) = p \in (0, 1)$. We consider the simple random walk $X_n$ on $E = \mathbb{Z}$
defined by $X_n = \sum_{k=0}^{n} \epsilon_k$. Suppose we want to evaluate (using a Monte
Carlo scheme) the probability that $X_n$ enters into a subset $A \subset \mathbb{N} - \{0\}$.
If we have $p < 1/2$, then the random walk tends to move to the left.
One natural way to increase the probability that the random walk visits
the set $A$ is to change $p$ by some $\bar p \in (p, 1)$. In this case, the random walk
$\bar X_n$ defined as $X_n$ by replacing $p$ by $\bar p$ is more likely to move to the right,
and as a result the event $\{\bar X_n \in A\}$ is more likely than $\{X_n \in A\}$. The
expected value of $f(X_n) = 1_A(X_n)$ and the particle approximation mean
using the standard Monte Carlo method are given respectively by

$$E(f(X_n)) = P(X_n \in A) \quad \text{and} \quad m(X_n)(f) = \frac{1}{N} \sum_{i=1}^{N} 1_A(X_n^i)$$

where $(X_n^i)_{i \ge 1}$ is a collection of independent copies of $X_n$.

• We let $P_n$ be the distribution of the random sequence $(\epsilon_p)_{0 \le p \le n}$ on
$\{-1, +1\}^{n+1}$. Check that

$$P_n(d(u_0, \ldots, u_n)) = (p(1 - p))^{(n+1)/2}\, (p/(1 - p))^{\frac{1}{2} \sum_{k=0}^{n} u_k}$$

• We let $\bar P_n$ be the distribution of the random sequence $(\bar\epsilon_p)_{0 \le p \le n}$
defined as $(\epsilon_p)_{0 \le p \le n}$ by replacing $p$ by some $\bar p \in (0, 1)$. Deduce from
the first question that $P_n \ll \bar P_n$ and

$$\frac{dP_n}{d\bar P_n}(u_0, \ldots, u_n) = G_n\Big(\sum_{k=0}^{n} u_k\Big) \quad \text{with} \quad G_n(x) = u(p, \bar p)^{(n+1)/2}\, v(p, \bar p)^{x/2}$$

and $(u(p, \bar p), v(p, \bar p)) = \Big( \frac{p(1 - p)}{\bar p(1 - \bar p)}, \frac{p(1 - \bar p)}{\bar p(1 - p)} \Big)$.

• Check that $E(f(X_n)) = E(f(\bar X_n) G_n(\bar X_n))$ for any $f \in B_b(\mathbb{Z})$.

• Let $(\bar X_n^i)_{i \ge 1}$ be a collection of independent copies of $\bar X_n$. By the
central limit theorem, prove that the sequence of random variables

$$W_n^N(f) = \sqrt{N}\, (m(X_n)(f) - E(f(X_n)))$$

$$\bar W_n^N(f) = \sqrt{N}\, (m(\bar X_n)(f G_n) - E(f(X_n)))$$

converges in law, as $N \to \infty$, to a pair of Gaussian random variables
with mean 0 and respective variance $\sigma_n^2(f)$ and $\bar\sigma_n^2(f)$ defined by

$$\sigma_n^2(f) = E(f(X_n)^2) - E(f(X_n))^2$$

$$\bar\sigma_n^2(f) = E(f(\bar X_n)^2 G_n(\bar X_n)^2) - E(f(X_n))^2 = \sigma_n^2(f) + E(f(X_n)^2 (G_n(X_n) - 1))$$

• Prove that, for any indicator functions $f = 1_A$ with $A \subset (G_n \le
1/a_n)$, for some $a_n > 1$ we have

$$\bar\sigma_n^2(f) \le a_n^{-1} P(X_n \in A) - P(X_n \in A)^2 \le \sigma_n^2(f)$$
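
The identity $E(f(X_n)) = E(f(\bar X_n) G_n(\bar X_n))$ can be checked exactly from the binomial law of the walk. The sketch below is illustrative only; the function names are hypothetical, and the weight used is the ratio of the two path probabilities written as a function of the terminal value $x$.

```python
from math import comb

def walk_law(n, p):
    """Law of X_n = eps_0 + ... + eps_n as a dict {value: probability}."""
    steps = n + 1
    return {2 * k - steps: comb(steps, k) * p**k * (1 - p)**(steps - k)
            for k in range(steps + 1)}

def likelihood_ratio(n, p, p_bar, x):
    """G_n(x) = u^{(n+1)/2} v^{x/2}, the density dP_n / dP_bar_n on paths ending at x."""
    u = p * (1 - p) / (p_bar * (1 - p_bar))
    v = p * (1 - p_bar) / (p_bar * (1 - p))
    return u ** ((n + 1) / 2) * v ** (x / 2)

def twisted_probability(n, p, p_bar, A):
    """P(X_n in A) computed under the twisted walk with the weight G_n."""
    law_bar = walk_law(n, p_bar)
    return sum(prob * likelihood_ratio(n, p, p_bar, x)
               for x, prob in law_bar.items() if x in A)

A = {6, 8, 10}
direct = sum(prob for x, prob in walk_law(9, 0.3).items() if x in A)
print(direct, twisted_probability(9, 0.3, 0.6, A))
```

The two printed values should coincide up to rounding, whatever twisted parameter $\bar p \in (0, 1)$ is used.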



Exercise 12.3.5: Let $(\epsilon_n)_{n \ge 0}$ and $(\bar\epsilon_n)_{n \ge 0}$ be two collections of
independent and identically distributed exponential random variables with
respective intensity parameters $\lambda, \bar\lambda > 0$. Use the same line of
reasoning as in the previous exercise to evaluate the occurrence of the event
$X_n = \sum_{p=0}^{n} \epsilon_p \in A \subset (0, \infty)$.

Exercise 12.3.6: [Importance sampling and large deviations] Let $m(X) =
\frac{1}{N} \sum_{i=1}^{N} \delta_{X^i}$ be the empirical measure associated with a sequence of $N$
independent and identically distributed random variables $X = (X^i)_{1 \le i \le N}$ with
common law $\mu$ (on some measurable space $(E, \mathcal E)$). We consider a
nonnegative and bounded test function $V$ on $E$. Suppose we want to evaluate the
quantity $p(x) = P(m(X)(V) \ge x)$ for some $x > \mu(V) = E(m(X)(V))$. We
consider $N'$ independent copies $(X_j)_{1 \le j \le N'}$ of the $N$-dimensional random
vector $X$ and we set $S_{N'} = \frac{1}{N'} \sum_{j=1}^{N'} 1_{[x, \infty)}(m(X_j)(V))$. Check that $S_{N'}$ is
an unbiased estimate in the sense that $E(S_{N'}) = p(x)$ and for any $N' \ge 1$
we have

$$N'\, E([S_{N'} - P(m(X)(V) \ge x)]^2) = p(x)(1 - p(x)) \qquad (12.18)$$

• Prove that $p(x) \le e^{-N I(x)}$ with $I(x) = \sup_{\lambda \ge 0}(\lambda x - \log \mu(e^{\lambda V}))$.

• Check that the supremum in the display above is attained at some
$\lambda_{V,x} \in (0, \infty)$ that satisfies the equation

$$x = \Psi_{x,V}(\mu)(V) \quad \text{with} \quad \Psi_{x,V}(\mu)(dy) = \frac{e^{\lambda_{V,x} V(y)}}{\mu(e^{\lambda_{V,x} V})}\, \mu(dy)$$

• Let $m(\bar X) = \frac{1}{N} \sum_{i=1}^{N} \delta_{\bar X^i}$ be the empirical measure associated with a
sequence of $N$ independent and identically distributed random
variables $\bar X = (\bar X^i)_{1 \le i \le N}$ with common law $\Psi_{x,V}(\mu)$. Prove that $p(x) =
E(1_{[x, \infty)}(m(\bar X)(V))\, Z(\bar X))$ with $Z(\bar X) = \prod_{i=1}^{N} \frac{d\mu}{d\Psi_{x,V}(\mu)}(\bar X^i)$. We
consider $N'$ independent copies $(\bar X_j)_{1 \le j \le N'}$ of the $N$-dimensional
random vector $\bar X$ and we set

$$\bar S_{N'} = \frac{1}{N'} \sum_{j=1}^{N'} Z(\bar X_j)\, 1_{[x, \infty)}(m(\bar X_j)(V))$$

Prove that $E(\bar S_{N'}) = P(m(X)(V) \ge x)$ and for any $N' \ge 1$ we have

$$N'\, E([\bar S_{N'} - P(m(X)(V) \ge x)]^2) = E(1_{[x, \infty)}(m(X)(V))\, Z(X)) - p(x)^2 \le e^{-N I(x)}\, p(x) - p(x)^2$$

Compare this growth rate with the one obtained without changing
the reference sampling measure (12.18).



Exercise 12.3.7: Let $X_n$ be a Markov chain taking values in some
measurable spaces $(E_n, \mathcal E_n)$, with elementary transitions $M_n$, and initial
distribution $\eta_0$. We consider a collection of Markov transitions $\bar M_n$ satisfying
the regularity conditions stated in Section 12.3.1. We introduce the pair
potentials/transitions $(\mathbf{G}_n, \mathbf{M}_n)$ on the transition spaces $\mathbf{E}_n = (E_n \times E_{n+1})$
defined by

$$\mathbf{M}_n((x, y), d(x', y')) = \delta_y(dx')\, \bar M_{n+1}(x', dy')$$

$$\mathbf{G}_n(x, y) = \frac{dM_{n+1}(x, \cdot)}{d\bar M_{n+1}(x, \cdot)}(y)$$

Describe the Feynman-Kac model associated with the pair $(\mathbf{G}_n, \mathbf{M}_n)$ with
initial distribution $\eta_0 \times \bar M_1$. Show that the corresponding genealogical
approximation model can be interpreted as a particle simulation method of
the Markov chain $X_n$.



Exercise 12.3.8: Let $X_n$ be a random walk on $\mathbb{Z}$ starting at the origin
with elementary transitions

$$M_n(x, dy) = p_n(x)\, \delta_{x+1}(dy) + q_n(x)\, \delta_{x-1}(dy)$$

where $p_n(x)$ and $q_n(x)$ are $(0, 1)$-valued numbers such that $p_n(x) + q_n(x) =
1$. We let $\bar M_n$ be the Markov transition defined as $M_n$ by replacing the pair
$(p_n(x), q_n(x))$ by another pair of $(0, 1)$-valued parameters $(\bar p_n(x), \bar q_n(x))$
such that $\bar p_n(x) + \bar q_n(x) = 1$. Prove that for any $x, y \in \mathbb{Z}$ with $(y - x) \in
\{-1, +1\}$ we have

$$\frac{dM_n(x, \cdot)}{d\bar M_n(x, \cdot)}(y) = \frac{p_n(x)}{\bar p_n(x)}\, 1_{x+1}(y) + \frac{q_n(x)}{\bar q_n(x)}\, 1_{x-1}(y) = \Big(\frac{p_n(x)}{\bar p_n(x)}\Big)^{\frac{1 + (y - x)}{2}} \Big(\frac{q_n(x)}{\bar q_n(x)}\Big)^{\frac{1 - (y - x)}{2}}$$

Deduce that for any $f_n \in B_b(\mathbb{Z}^{n+1})$ we have

$$E(f_n(X_{[0,n]})) = \bar E\Big( f_n(X_{[0,n]}) \prod_{k=1}^{n} G_k(X_{k-1}, X_k) \Big)$$

with the potential functions

$$G_k(X_{k-1}, X_k) = \Big(\frac{p_k(X_{k-1})}{\bar p_k(X_{k-1})}\Big)^{\frac{1 + \Delta X_k}{2}} \Big(\frac{q_k(X_{k-1})}{\bar q_k(X_{k-1})}\Big)^{\frac{1 - \Delta X_k}{2}}$$

In the above display, $\Delta X_k = (X_k - X_{k-1})$ and $\bar E$ is the expectation w.r.t.
the law of the random walk starting at the origin and evolving with the
Markov transitions $\bar M_n$. Describe a genealogical particle simulation method
of the random walk with transitions $M_n$ using the potential functions $G_n$
and the mutation transitions $\bar M_n$. If we choose $\bar p_n = q_n = 1 - p_n$, then
prove that

$$G_k(X_{k-1}, X_k) = (p_k(X_{k-1})/q_k(X_{k-1}))^{\Delta X_k} > 1 \iff \Delta X_k = -1$$

In this situation, deduce that the particles are more likely to move to the
right, and the selection transition tends to favor the particles moving to
the left.



12.4 Spectral Analysis of 

Feynman-Kac-Schrodinger Semigroups 

In this section, we apply the particle methodology for the numerical
solution of the Lyapunov exponent of Feynman-Kac and Schrodinger type
semigroups on some classical Banach spaces. In some situations, these
important spectral quantities also coincide with the principal eigenvalues of
positive operators.

Except in some particular situations such as for the well-known harmonic
oscillator or for the one-dimensional neutron model discussed on page 149 (see
also Theorem 10.1 and the example on pp. 67-68 in [176]), explicit and
analytic descriptions of these exponents are generally not available and
we need to resort to some kind of approximation. Several strategies have
been suggested in the literature. To name a few, the perturbation theory
proposes in some instances asymptotic expansions of isolated eigenvalues
(see for instance Kato [200]). Several characterizations of Lyapunov
exponents have been suggested in the mathematics literature in the early
1950s, including the work of H. Wielandt [312], Krein and Rutman [208],
Birkhoff [32] and Harris [176]. Donsker and Varadhan have also presented
in a series of papers (see for instance [117]) a theory of large deviations that
expresses the well-known Rayleigh-Ritz representation of the top eigenvalue
in terms of a variational problem in distribution space. In some particular
situations, this global optimization problem can be solved by using for
instance some kind of stochastic global search algorithm or specific Hilbert
projection techniques.

Our approach consists in expressing these spectral exponents and the
corresponding eigenfunctions in terms of the fixed point of a nonlinear
Feynman-Kac distribution flow. These key functional representations bring
some new light on connections between the spectral theory of Schrodinger
operators and interacting measure-valued processes. They also provide a
natural microscopic particle interpretation of these spectral quantities.
Furthermore, the uniform convergence analysis derived in Section 7.4.3 lays
solid theoretical foundations for the asymptotic and the long time
behavior of these particle approximation models. This approach, which has been
presented in [102] for continuous and discrete time models, was influenced
by the recent works of Burdzy, Holyst, Ingerman, and March [186, 38],
Hetherington [179], Sznitman [295], and earlier joint work of the author with
Guionnet [86] and Doucet [82].

12.4.1 Lyapunov Exponents and Spectral Radii

We consider a time-homogeneous Markov chain

$$(\Omega = E^{\mathbb{N}}, \mathcal F = (\mathcal F_n)_{n \ge 0}, X = (X_n)_{n \ge 0}, (P_x)_{x \in E})$$

taking values in a measurable space $(E, \mathcal E)$ with Markov transitions $M$. We
also let $G$ be a measurable potential function on $E$ that satisfies the
regularity condition $(G)$ stated on page 115, and we set $r = \sup_{x,y} \{G(x)/G(y)\} <
\infty$. We associate with the pair $(G, M)$ the integral operator on the Banach
space $(B_b(E), \|\cdot\|)$ defined by

$$Q(x, dy) = G(x)\, M(x, dy) \qquad (12.19)$$

Note that $\|Q\| = \sup_{\|f\| = 1} \|Q(f)\| \le \|G\|$. The semigroup $Q^n$ on $B_b(E)$
associated with $Q$ is defined by the formula $Q^n = Q^{n-1} Q$ with $Q^0 = Id$.

For any $f \in B_b(E)$, we observe that $Q^n(f)$ is also given by the
Feynman-Kac formula

$$Q^n(f)(x) = E_x\Big( f(X_n) \prod_{p=0}^{n-1} G(X_p) \Big)$$

Since the potential is assumed to be bounded, one finds that $Q^n$ is a
collection of bounded operators on $B_b(E)$ with norm

$$\|Q^n\| = \sup_{\|f\| = 1} \|Q^n(f)\| = \|Q^n(1)\| \quad (\le \|G\|^n)$$

The Lyapunov exponent or the spectral radius $Lyap(Q) \in [0, +\infty]$ of the
semigroup $Q$ on the Banach space $B_b(E)$ is the quantity defined by
subadditive arguments as

$$Lyap(Q) = Spr(Q) = \lim_{n \to \infty} \|Q^n\|^{1/n} = \inf_{n \ge 1} \|Q^n\|^{1/n} \qquad (12.20)$$
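
For a finite state space, definition (12.20) can be evaluated directly, since $\|Q^n\| = \max_x Q^n(1)(x)$. The sketch below is an illustration with hypothetical function names and a made-up two-state example; the reference value is the top eigenvalue of the $2 \times 2$ matrix $Q$, computed from its trace and determinant.

```python
import math

def apply_Q(G, M, f):
    """Q(f)(x) = G(x) * sum_y M(x, y) f(y) on a finite state space."""
    n = len(G)
    return [G[x] * sum(M[x][y] * f[y] for y in range(n)) for x in range(n)]

def norm_Qn(G, M, n):
    """||Q^n|| = ||Q^n(1)|| = max_x Q^n(1)(x)."""
    f = [1.0] * len(G)
    for _ in range(n):
        f = apply_Q(G, M, f)
    return max(f)

def lyapunov_estimate(G, M, n):
    """Finite-n approximation ||Q^n||^{1/n} of Lyap(Q)."""
    return norm_Qn(G, M, n) ** (1.0 / n)

G = [0.5, 1.0]
M = [[0.9, 0.1], [0.2, 0.8]]
# Q = [[0.45, 0.05], [0.2, 0.8]] has trace 1.25 and determinant 0.35
top_eigenvalue = (1.25 + math.sqrt(1.25**2 - 4 * 0.35)) / 2
print(lyapunov_estimate(G, M, 400), top_eigenvalue)
```

Because the limit in (12.20) is also an infimum, the finite-$n$ values approach the spectral radius from above.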



12.4.2 Feynman-Kac Asymptotic Models

Next we propose a way to relate the logarithmic exponents

$$\Lambda(G) = \log Lyap(Q) = \lim_{n \to \infty} \frac{1}{n} \sup_{x \in E} \log Q^n(1)(x)$$

with the long time average of the Feynman-Kac distribution flow model $\eta_n$
defined for any $f \in B_b(E)$ by $\eta_n(f) = \gamma_n(f)/\gamma_n(1)$ with

$$\gamma_n(f) = \eta_0 Q^n(f) = E_{\eta_0}\Big( f(X_n) \prod_{p=0}^{n-1} G(X_p) \Big)$$

where $\eta_0$ is a given probability measure on $E$ and $E_{\eta_0}(\cdot)$ represents the
expectation operator with respect to the distribution $P_{\eta_0} = \int \eta_0(dx)\, P_x$.

We recall that the distribution flow satisfies the time-homogeneous and
nonlinear equations

$$\eta_n = \Phi(\eta_{n-1}) = \Psi(\eta_{n-1}) M$$

with the Boltzmann-Gibbs transformation $\Psi : \mathcal P(E) \to \mathcal P(E)$ defined for
any $\eta \in \mathcal P(E)$ and $f \in B_b(E)$ by

$$\Psi(\eta)(f) = \eta(G f)/\eta(G)$$

We denote by $\Phi^n$, $n \ge 0$, the corresponding nonlinear evolution semigroup
defined by

$$\Phi^n = \Phi^{n-1} \circ \Phi \quad \text{and} \quad \Phi^0 = Id$$

The choice of the initial distribution may vary. When $\eta_0 = \delta_x$ sometimes
we write $\gamma_n^{(x)}$ and $\eta_n^{(x)}$ the corresponding measures. In this notation, we
have

$$\gamma_n^{(x)}(f) = \delta_x Q^n(f) = Q^n(f)(x) \quad \text{and} \quad \eta_n^{(x)}(f) = \Phi^n(\delta_x)(f) = Q^n(f)(x)/Q^n(1)(x)$$

On the other hand, we observe that

$$\gamma_n(1) = \gamma_{n-1}(G) = \eta_{n-1}(G)\, \gamma_{n-1}(1)$$

and

$$\gamma_n(1) = \prod_{p=0}^{n-1} \eta_p(G)$$

This yields that

$$\Lambda_n^{(x)}(G) =_{def.} \frac{1}{n} \log Q^n(1)(x) = \frac{1}{n} \log \gamma_n^{(x)}(1) = \frac{1}{n} \sum_{p=0}^{n-1} \log \eta_p^{(x)}(G)$$

We note that the logarithmic exponents $\Lambda(G)$ can be rewritten as

$$\Lambda(G) = \lim_{n \to \infty} \sup_{x \in E} \Lambda_n^{(x)}(G)$$

Our next objective is to express the Lyapunov exponent $Lyap(Q)$ in terms
of the fixed point $\eta_\infty$ of the semigroup $\Phi^n$. The existence and the
possibility of approximating such a representation will depend on the asymptotic
stability properties of $\Phi^n$. These properties are discussed in some detail in
Sections 4.3 and 4.4. In the present section we simplify the presentation and
we use the following contraction condition:

$(\Phi)$ For any $\mu_1, \mu_2 \in \mathcal P(E)$ and $n \ge 0$ we have

$$\|\Phi^n(\mu_1) - \Phi^n(\mu_2)\|_{tv} \le \beta(\Phi^n)\, \|\mu_1 - \mu_2\|_{tv} \quad \text{with} \quad \beta(\Phi) = \sum_{n \ge 0} \beta(\Phi^n) < \infty$$

This contraction property is difficult to check in practice. Several sufficient
conditions in terms of the pair $(G, M)$ are derived in Chapter 4. For
instance, suppose the Markov kernel $M$ satisfies the mixing condition $(M)_m$
stated on page 139 for some $m \ge 1$ and some $\epsilon(M) = \epsilon \in (0, 1]$. Then,
using Proposition 4.3.5, we find that

< 2e-^ r”* (l - 

In addition, we have the following estimate, which does not depend on the 
potential function 

By the Banach fixed point theorem the contraction condition $(\Phi)$ implies
the existence of a unique fixed point $\eta_\infty \in \mathcal P(E)$. The interplay between
the latter and the quantities $(G, M)$ is described by the fixed point formula

$$\eta_\infty(f) = \Phi(\eta_\infty)(f) = \eta_\infty(G M(f))/\eta_\infty(G)$$

If we take $f = Q^n(1)$, we readily find the recursive formula

$$\eta_\infty(Q^{n+1}(1)) = \eta_\infty(Q^n(1))\, \eta_\infty(Q(1)) = \eta_\infty(Q^n(1))\, \eta_\infty(G)$$

Thus we have $\eta_\infty(Q^n(1)) = (\eta_\infty(G))^n$, from which we conclude that







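Using the inequality $|\log x - \log y| \le |x - y|/(x \wedge y)$, which is valid for any
$x, y > 0$, we also find the estimates

$$|\Lambda_n^{(x)}(G) - \Lambda_n^{(y)}(G)| \le \frac{1}{n} \sum_{p=0}^{n-1} |\log \eta_p^{(x)}(G) - \log \eta_p^{(y)}(G)| \le \frac{2r}{n} \sum_{p=0}^{n-1} \beta(\Phi^p)$$

and in much the same way

$$|\Lambda_n^{(x)}(G) - \log \eta_\infty(G)| \le \frac{2}{n}\, r\, \beta(\Phi) \qquad (12.21)$$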

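We summarize the discussion above with the following proposition.

Proposition 12.4.1 When condition $(\Phi)$ holds true, then the logarithmic
Lyapunov exponent is given by the formulae

$$\Lambda(G) = \log \eta_\infty(G) = \lim_{n \to \infty} \Lambda_n^{(x)}(G) \quad \text{with} \quad \Lambda_n^{(x)}(G) = \frac{1}{n} \sum_{p=0}^{n-1} \log \eta_p^{(x)}(G)$$

In addition, we have the uniform estimates

$$\sup_{x \in E} \|\eta_n^{(x)} - \eta_\infty\|_{tv} \le \beta(\Phi^n) \quad \text{and} \quad n \sup_{x \in E} |\Lambda_n^{(x)}(G) - \Lambda(G)| \le 2 r \beta(\Phi)$$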
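
The representation of the logarithmic exponent as a time average along the flow can be tried out on a finite state space. The sketch below is a hypothetical illustration: the function names and the two-state pair $(G, M)$ are assumptions, the flow is started from $\eta_0 = \delta_{x_0}$, and the reference value is the logarithm of the top eigenvalue of the $2 \times 2$ matrix $Q = G\,M$.

```python
import math

def boltzmann_gibbs(eta, G):
    """Psi(eta)(x) = eta(x) G(x) / eta(G)."""
    z = sum(e * g for e, g in zip(eta, G))
    return [e * g / z for e, g in zip(eta, G)]

def phi(eta, G, M):
    """One step of the nonlinear flow: Phi(eta) = Psi(eta) M."""
    psi = boltzmann_gibbs(eta, G)
    n = len(eta)
    return [sum(psi[x] * M[x][y] for x in range(n)) for y in range(n)]

def log_lyapunov_via_flow(G, M, x0=0, n=2000):
    """Return (1/n) sum_{p<n} log eta_p(G) and eta_n, starting from delta_{x0}."""
    eta = [1.0 if x == x0 else 0.0 for x in range(len(G))]
    total = 0.0
    for _ in range(n):
        total += math.log(sum(e * g for e, g in zip(eta, G)))
        eta = phi(eta, G, M)
    return total / n, eta

G = [0.5, 1.0]
M = [[0.9, 0.1], [0.2, 0.8]]
average, eta_n = log_lyapunov_via_flow(G, M)
top_eigenvalue = (1.25 + math.sqrt(1.25**2 - 4 * 0.35)) / 2
print(average, math.log(top_eigenvalue), sum(e * g for e, g in zip(eta_n, G)))
```

In this example $\eta_n(G)$ settles on the top eigenvalue quickly, while the time average carries an error of order $1/n$, in line with the uniform estimate of the proposition.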



12.4 3 Particle Lyapunov Exponents 

The choice of a particle interpretation of the Feynman-Kac semigroups
introduced in Section 12.4.2 is not unique. We refer the reader to Section 2.5.3
as well as to Chapter 11 for a thorough discussion on different McKean
interpretation models. To fix the ideas, we consider here the $N$-particle
approximation models introduced in (11.11), and we assume that the
potential function $G$ satisfies condition $(G)$ for some $\epsilon(G) = 1/r \in (0, 1]$ with
$1 \le r < \infty$ and the mutation transition $M$ satisfies the mixing condition
$(M)_m$ for some $m \ge 1$ and some $\epsilon(M) = \epsilon > 0$.

We recall that during the mutation stage the particles evolve randomly
according to the Markov transitions $M$. During the selection, each of these
particles $\xi_n^i$ randomly selects an individual according to the distribution

$$S_{m(\xi_n)}(\xi_n^i, \cdot) = \frac{G(\xi_n^i)}{\vee_{j=1}^N G(\xi_n^j)}\, \delta_{\xi_n^i} + \left(1 - \frac{G(\xi_n^i)}{\vee_{j=1}^N G(\xi_n^j)}\right) \Psi(m(\xi_n)) \tag{12.22}$$

In other words, with a probability $G(\xi_n^i)/\vee_{j=1}^N G(\xi_n^j)$, it remains in the same site,
and in the opposite event it jumps to a newly selected site randomly chosen







with distribution $\Psi(m(\xi_n))$. For $(0, 1]$-valued potentials $G$, the regions of
the state space where $G < 1$ can be interpreted as soft obstacles. When
the particles evolve in these regions, their lifetime is decreasing and they
try to escape by selecting individuals with higher potential value. In the
opposite case, the particles evolving in state-space regions where $G = 1$ are
not affected by the selection pressure and evolve randomly according to $M$
elementary transitions.
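
The selection and mutation mechanism just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the particles are stored in a list, the potential takes values in $[0, 1]$, and the names `selection_step`, `mutation_step`, `lazy_walk` and `soft_obstacle` are hypothetical helpers that do not appear in the text.

```python
import random

def selection_step(particles, G, rng):
    """Selection transition in the spirit of (12.22).

    A particle x stays put with probability G(x) / max_j G(x_j); otherwise it
    jumps to a particle drawn with probability proportional to its potential.
    """
    weights = [G(x) for x in particles]
    g_max = max(weights)
    if g_max <= 0.0:
        raise ValueError("every particle has zero potential")
    selected = []
    for x, w in zip(particles, weights):
        if rng.random() < w / g_max:
            selected.append(x)
        else:
            selected.append(rng.choices(particles, weights=weights, k=1)[0])
    return selected

def mutation_step(particles, M_sample, rng):
    """Each particle moves independently according to the sampler M_sample."""
    return [M_sample(x, rng) for x in particles]

def lazy_walk(x, rng):
    """Illustrative mutation: lazy random walk on the integers."""
    return x + rng.choice((-1, 0, 1))

def soft_obstacle(x):
    """Illustrative potential: G = 1/2 on {-2, ..., 2}, G = 1 elsewhere."""
    return 0.5 if -2 <= x <= 2 else 1.0

rng = random.Random(1)
particles = [0] * 10
for _ in range(5):
    particles = selection_step(particles, soft_obstacle, rng)
    particles = mutation_step(particles, lazy_walk, rng)
```

Particles sitting where the potential is maximal never move during the selection, while particles with zero potential are always replaced.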

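We consider hereafter the $N$-particle approximation model of the flow $\eta_n^{(x)}$
and we denote by $\eta_n^{(N,x)}$ the $N$-particle approximation measures
$\eta_n^{(N,x)} = \frac{1}{N} \sum_{i=1}^N \delta_{\xi_n^i}$. Under condition $(M)_m$, we have proved in Theorem 7.4.4 the
following uniform estimate: For any $f \in \mathcal{B}_b(E)$, $\|f\| \le 1$, and $N \ge 1$

$$\sqrt{N} \sup_{x\in E,\, n\ge 0} \mathbb{E}\left(|\eta_n^{(N,x)}(f) - \eta_n^{(x)}(f)|^p\right)^{1/p} \le a(p)\, c(m, \epsilon)$$

for some finite constants $c(m, \epsilon) < \infty$ whose values depend on the triplet
$(\epsilon, m, G)$. If we apply this result to $f = G/\|G\|$, we find that

$$\sqrt{N} \sup_{x\in E,\, n\ge 0} \mathbb{E}\left(|\eta_n^{(N,x)}(G) - \eta_n^{(x)}(G)|^p\right)^{1/p} \le a(p)\, c(m, \epsilon)\, \|G\|$$

from which we conclude that

$$\sqrt{N} \sup_{x\in E,\, n\ge 1} \mathbb{E}\left(|\Lambda_n^{(N,x)}(G) - \Lambda_n^{(x)}(G)|^p\right)^{1/p} \le a(p)\, c(m, \epsilon)\, \|G\|$$

with the $N$-particle approximation of the Lyapunov exponent

$$\Lambda_n^{(N,x)}(G) = \frac{1}{n} \sum_{p=0}^{n-1} \log \eta_p^{(N,x)}(G) = \frac{1}{n} \sum_{p=0}^{n-1} \log \frac{1}{N} \sum_{i=1}^N G(\xi_p^i)$$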



If we combine this uniform estimate with (12.21), we obtain the following. 

Theorem 12.4.1 Suppose condition $(M)_m$ is met for some $m \ge 1$ and
some $\epsilon > 0$. Then, for any $N \ge 1$, we have

$$\sup_{x\in E,\, n\ge \sqrt{N}} \sqrt{N}\; \mathbb{E}\left(|\Lambda_n^{(N,x)}(G) - \Lambda(G)|^p\right)^{1/p} < \infty$$



and for any $p \ge 1$ and $f \in \mathcal{B}_b(E)$ with $\|f\| \le 1$

sup y/N E (|t/W*)(/) - VooifWy'"’ < a{p) c{m,e) 

xeE, n>jJjIogN ' ' 



for some finite constant $c(m, \epsilon) < \infty$ that depends on the triplet $(\epsilon, m, G)$.
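
As a rough illustration of the particle Lyapunov exponent, the following Python sketch averages the logarithms of the empirical means of the potential along the run of an $N$-particle model with a selection of the type (12.22). It assumes a strictly positive potential; the function `particle_lyapunov`, the reflected walk `reflected_walk` and the potential `G_example` are hypothetical choices made for this example only.

```python
import math
import random

def particle_lyapunov(G, M_sample, x0, n_steps, n_particles, seed=0):
    """Average of log((1/N) * sum_i G(xi_p^i)) over p = 0, ..., n_steps - 1."""
    rng = random.Random(seed)
    particles = [x0] * n_particles
    total = 0.0
    for _ in range(n_steps):
        weights = [G(x) for x in particles]
        total += math.log(sum(weights) / n_particles)
        g_max = max(weights)
        # selection: stay with probability G(x)/max G, else jump to a
        # particle chosen with probability proportional to its potential
        selected = [
            x if rng.random() < w / g_max
            else rng.choices(particles, weights=weights, k=1)[0]
            for x, w in zip(particles, weights)
        ]
        # mutation: independent moves with the sampler M_sample
        particles = [M_sample(x, rng) for x in selected]
    return total / n_steps

def reflected_walk(x, rng):
    """Illustrative mutation: lazy walk on {0, ..., 9}, held at the boundary."""
    return min(9, max(0, x + rng.choice((-1, 0, 1))))

def G_example(x):
    """Illustrative soft obstacle: G = 1/2 on {0, 1, 2}, G = 1 elsewhere."""
    return 0.5 if x < 3 else 1.0

estimate = particle_lyapunov(G_example, reflected_walk, 5, 200, 100)
```

For a constant potential the estimate reduces to the logarithm of that constant, which gives a simple sanity check of the recursion.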







12.4.4 Hard, Soft and Repulsive Obstacles

In an earlier section, we assumed that the potential function $G$ cannot take
null values, thus excluding some interesting physical situations. Suppose
that $G$ is a $[0, 1]$-valued function, and let $\hat{E} = G^{-1}((0, 1])$. In this situation,
the logarithmic exponent of the semigroup $Q(f) = G M(f)$ is given by

$$\Lambda(G) = \lim_{n\to\infty} \frac{1}{n} \sup_{x\in\hat{E}} \log \mathbb{E}_x\left(\prod_{p=0}^{n-1} G(X_p)\right) \in [-\infty, 0] \tag{12.23}$$



A trivial example to illustrate this situation is to choose the indicator
function $G = 1_A$ of some measurable set $A \in \mathcal{E}$. In this case, we clearly
have for any $f \in \mathcal{B}_b(E)$

$$\forall x \in \hat{E} = A \qquad \mathbb{E}_x\left(f(X_n) \prod_{p=0}^{n-1} G(X_p)\right) = \mathbb{E}_x\left(f(X_n)\, 1_{T \ge n}\right)$$

where $T$ is the exit time of $A$; that is, $T = \inf\{n \ge 0 : X_n \notin A\}$. Two
interpretations can be underlined. In the first one, the set $A^c = G^{-1}(0)$ is
regarded as a hard obstacle and $T$ as the killing time of the particle when
it hits $A^c$. The second dual interpretation is to interpret the set $A$ as a trap
where the particle spends some time before visiting $A^c$. Loosely speaking,
in this particular situation, we have in some sense and for large values of $n$

$$\sup_{x\in A} \mathbb{P}_x(T \ge n) \simeq e^{n \Lambda(G)}$$

The larger $\Lambda(G)$ ($\in [-\infty, 0]$) is, the smaller is the strength of the obstacle
or the larger is the trapping effect of $A$. We further assume that
the triplet $(\eta_0, G, M)$ satisfies the accessibility condition $(\mathcal{A})$ introduced in
(2.16); that is, for any $x \in \hat{E}$, we have $M(x, \hat{E}) > 0$ and $\eta_0(\hat{E}) > 0$. The
main simplification due to this condition is that the prediction and updated
Feynman-Kac models

$$\eta_n(f) = \gamma_n(f)/\gamma_n(1) \quad\text{and}\quad \hat\eta_n(f) = \Psi(\eta_n)(f) = \gamma_n(fG)/\gamma_n(G)$$

are well-defined for any time $n \ge 0$ (for more details, we refer the reader
to Section 2.5). Therefore we also have the asymptotic Feynman-Kac
interpretation

$$\Lambda(G) = \lim_{n\to\infty} \sup_{x\in\hat{E}} \frac{1}{n} \sum_{p=0}^{n-1} \log \eta_p^{(x)}(G) \tag{12.24}$$
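
On a finite state space the logarithmic exponent can be approximated directly by iterating the operator $Q(f) = G M(f)$ on the unit function. The Python sketch below is an illustration only: the helper names `log_lyapunov` and `killed_walk` are hypothetical, and the example is a simple random walk on $\{0, \ldots, K+1\}$ with the hard obstacle $\{0, K+1\}$, for which the top eigenvalue of the walk killed outside $\{1, \ldots, K\}$ is the classical value $\cos(\pi/(K+1))$.

```python
import math

def log_lyapunov(G, M, n):
    """Return (1/n) * max_x log (Q^n 1)(x) for Q(f) = G * M(f).

    G is a list of potential values in [0, 1] and M a row-stochastic matrix
    (list of rows). The vector Q^k(1) is renormalized by its maximum at each
    step to avoid underflow, and the logs of the scalings are accumulated.
    """
    size = len(G)
    v = [1.0] * size
    log_scale = 0.0
    for _ in range(n):
        v = [G[x] * sum(M[x][y] * v[y] for y in range(size)) for x in range(size)]
        top = max(v)
        if top == 0.0:
            return -math.inf
        log_scale += math.log(top)
        v = [a / top for a in v]
    return log_scale / n

def killed_walk(K):
    """Simple random walk on {0, ..., K+1} with G the indicator of {1, ..., K}."""
    size = K + 2
    M = [[0.0] * size for _ in range(size)]
    for x in range(size):
        M[x][max(x - 1, 0)] += 0.5
        M[x][min(x + 1, size - 1)] += 0.5
    G = [1.0 if 1 <= x <= K else 0.0 for x in range(size)]
    return G, M

G, M = killed_walk(5)
approx = log_lyapunov(G, M, 2000)
exact = math.log(math.cos(math.pi / 6))
```

The approximation error is of order $1/n$, in line with the estimates of Proposition 12.4.1.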

The particle approximation model of the Feynman-Kac prediction flow $\eta_n$,
described in Section 12.4.3 is defined in the same way, but it may happen
that at a given time all the configurations exit the set $\hat E$ and the
algorithm is stopped. Also note that if a given particle $\xi_n^i \notin \hat E$, then during
the selection stage it jumps to a new selected individual in $\hat E$. In a birth
and death interpretation, a particle evolving in $E - \hat E$ is killed and instantly
a new, randomly elected individual in $\hat E$ splits into two offsprings. In this
sense, the set $E - \hat E$ can be interpreted as a hard obstacle set. We again refer
the reader to Section 2.5 for more details on these particle evolution models
in absorbing media. In Chapter 3, we have developed several strategies
to estimate the probability of the events $\{\tau^N \le n\}$ and the asymptotic
analysis of the corresponding particle approximation models.

Our next objective is to design an alternative particle interpretation.
The key idea is to turn the hard obstacle set into a repulsive obstacle. To
describe this particle algorithm, we first observe that for any $x \in \hat E$ we
have that

$$M(x, dy)\, G(y) = \hat G(x)\, \hat M(x, dy)$$

with

$$\hat G(x) = M(G)(x) \quad\text{and}\quad \hat M(x, dy) = \frac{M(x, dy)\, G(y)}{M(G)(x)}$$

For instance, in the case where $G = 1_{\hat E}$ is the indicator function of some
measurable subset $\hat E$, we have that

$$\hat G(x) = M(x, \hat E) \quad\text{and}\quad \hat M(x, dy) = \frac{M(x, dy)\, 1_{\hat E}(y)}{M(x, \hat E)}$$



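From this observation, we readily check that

$$\gamma_n(fG) = \mathbb{E}_{\eta_0}\left(f(X_n) \prod_{p=0}^{n} G(X_p)\right) = \eta_0(G)\; \hat{\mathbb{E}}_{\hat\eta_0}\left(f(\hat X_n) \prod_{p=0}^{n-1} \hat G(\hat X_p)\right)$$

where $\hat\eta_0 = \Psi(\eta_0)$ and $\hat{\mathbb{E}}_{\hat\eta_0}(\cdot)$ represents the expectation with respect to
the law of a Markov chain $\hat X_n$ with initial distribution $\hat\eta_0$ and Markov
transitions $\hat M$. By (12.23), we finally conclude that

$$\Lambda(G) = \lim_{n\to\infty} \frac{1}{n+1} \sup_{x\in\hat E} \log \hat{\mathbb{E}}_x\left(\prod_{p=0}^{n-1} \hat G(\hat X_p)\right) \tag{12.25}$$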



This asymptotic Feynman-Kac interpretation of $\Lambda(G)$ is defined as in (12.23)
by replacing $(G, M)$ by $(\hat G, \hat M)$. Thus, the corresponding particle
approximation model is defined as the one described in Section 12.4.3, but the
particle explores the state space $\hat E$ with the elementary transitions $\hat M$ and
the selection transition is defined as in (12.22) by replacing the potential
function $G$ by $\hat G$. Note that in the former model the particle will never visit
the hard obstacle set $E - \hat E$. In this sense, when replacing $M$ by $\hat M$, we
turn the hard obstacle set into a soft obstacle set. From these observations,
we see that the whole asymptotic analysis presented in Section 12.4.3 is
still valid if we replace $(G, M)$ by $(\hat G, \hat M)$. Finally we refer the reader to
the end of Section 4.4 for examples of mutation transitions $\hat M$ satisfying
the mixing condition $(M)_m$.
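
On a finite state space the passage from $(G, M)$ to the updated pair can be written explicitly. The short Python sketch below is an illustration under that finiteness assumption; the function name `updated_pair` and the five-state example are hypothetical.

```python
def updated_pair(G, M):
    """Return (G_hat, M_hat) with G_hat = M(G) and M_hat(x, y) = M(x, y) G(y) / M(G)(x).

    Rows x with M(G)(x) = 0 are left equal to zero (no admissible move).
    """
    size = len(G)
    G_hat = [sum(M[x][y] * G[y] for y in range(size)) for x in range(size)]
    M_hat = [
        [M[x][y] * G[y] / G_hat[x] if G_hat[x] > 0 else 0.0 for y in range(size)]
        for x in range(size)
    ]
    return G_hat, M_hat

# Simple random walk on {0, ..., 4}, held at the boundary, with the
# hard obstacle {0, 4}: G is the indicator of {1, 2, 3}.
M = [
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.5, 0.5],
]
G = [0.0, 1.0, 1.0, 1.0, 0.0]
G_hat, M_hat = updated_pair(G, M)
```

In this example the updated potential equals $1/2$ next to the obstacle and $1$ at the central site, and the updated transition never charges the obstacle: the hard obstacle has become a soft, repulsive one.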







12.4.5 Related Spectral Quantities

In this section, we discuss the interplay between the Lyapunov exponent
and some other related spectral quantities. In what follows, $Q$ is a given
bounded operator on $\mathcal{B}_b(E)$ such that $f \ge 0 \Rightarrow Q(f) \ge 0$. This condition
is clearly satisfied for the operator $Q(f) = G M(f)$. Suppose there exists
a probability $\mu$ on $(E, \mathcal{E})$ and a constant $c > 0$ such that the following
inequalities are satisfied for any $f \in \mathcal{B}_b(E)$:

$$\mu(|Q(f)|) \le c\, \mu(|f|) \tag{12.26}$$

Under this condition, the image $Q(f)$ of a function $f \in \mathcal{B}_b(E)$ negligible
with respect to $\mu$ remains negligible. Thus $Q$ is a well-defined operator on
$L_\infty(\mu)$. From the upper bound above, $Q$ can be extended in a unique way
as a bounded operator on $L_1(\mu)$. We also observe that for any $f \in \mathcal{B}_b(E)$
we have

$$\mu(Q(f)^2) \le \mu(Q(f^2)\, Q(1)) \le \|Q(1)\|\, \mu(Q(f^2))$$

so that $Q$ is also a well-defined operator on $L_2(\mu)$. From this observation,
we are led to consider the corresponding notion of spectral radius,

$$\mathrm{Spr}_{2,\mu}(Q) := \lim_{n\to\infty} |||Q^n|||_{2,\mu}^{1/n}$$

where

$$|||Q^n|||_{2,\mu} = \sup_{f \in L_2(\mu)\setminus\{0\}} \mu(Q^n(f)^2)^{1/2} / \mu(f^2)^{1/2}$$

If $E$ is finite and $\mu$ gives positive weight to any of its points, then the
equivalence of norms on finite-dimensional space (in this case the algebra
of $E \times E$ matrices) enables us to see that $\mathrm{Spr}(Q) = \mathrm{Spr}_{2,\mu}(Q)$, but this
equality is not always satisfied. Even when E is finite, it is easy to construct 
an example for which iQI > IQh,, with a probability /x not charging the 
whole set E (what is always true in this finite context is that |Q|oo m = 
Nevertheless, under a symmetry assumption, we have the following 

result. 

Lemma 12.4.1 If $Q$ is self-adjoint on $L_2(\mu)$, then we have $\mathrm{Spr}(Q) \ge \mathrm{Spr}_{2,\mu}(Q)$.

Proof: 

Let a function / € and an integer n > 1 be given. Using the symmetry 
of Q”, we obtain 

Taking a supremum over / e \ {0}, this shows that 







thus, letting n go to infinity, we obtain the previous bound. 



To prove a reverse inequality, we assume that $Q$ can be written as a density
kernel with respect to $\mu$; namely, that there exists a measurable mapping
$q : E \times E \to \mathbb{R}_+$ such that for any $f \in \mathcal{B}_b(E)$, $x \in E$

$$Q(f)(x) = \int q(x, y)\, f(y)\, \mu(dy)$$

Lemma 12.4.2 Under the hypothesis that $\sup_{x\in E} \int q(x, y)^2\, \mu(dy) < +\infty$,
we have $\mathrm{Spr}(Q) \le \mathrm{Spr}_{2,\mu}(Q)$. In addition, if $Q$ is self-adjoint (i.e., $q$ is
symmetric, $\mu \otimes \mu$-a.s.), then we have that $\mathrm{Spr}(Q) = \mathrm{Spr}_{2,\mu}(Q)$.

Proof: 

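We have for any integer number $n \ge 1$ and point $x \in E$, by the
Cauchy-Schwarz inequality,

$$\begin{aligned} Q^n(1)(x) &= \int q(x, y)\, Q^{n-1}(1)(y)\, \mu(dy) \\ &\le \sqrt{\int q(x, y)^2\, \mu(dy)}\; \sqrt{\mu\left(Q^{n-1}(1)^2\right)} \\ &\le \sqrt{\int q(x, y)^2\, \mu(dy)}\; |||Q^{n-1}|||_{2,\mu} \end{aligned}$$

Taking the supremum over $x \in E$ and then the $n$th root, and finally letting
$n$ be large, we obtain the affirmations of the lemma. ■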

Let $Q'$ be the semigroup on $\mathcal{B}_b(E)$ defined by $Q'(f) = G^{1/2} M(f G^{1/2})$.
Observe that $G^{1/2} Q'(f/G^{1/2}) = G M(f)$ for any $f \in \mathcal{B}_b(E)$. Since the
mapping $f \mapsto G^{1/2} f$ is an isomorphism from $\mathcal{B}_b(E)$ into itself, we conclude
that $Q(f) = G M(f)$ and $Q'$ have the same spectrum, the same eigenvalues,
and the same spectral radius (notions to be understood in the Banach
space $\mathcal{B}_b(E)$; see for instance [200]). The main simplification in working
with $Q'$ is that if $M$ is reversible with respect to a probability $\mu$ (i.e.
$\mu(f_1 M(f_2)) = \mu(M(f_1) f_2)$) then the same is true for $Q'$. In this case, $Q'$
can be extended as a self-adjoint operator on $L_2(\mu)$. Finally, we have the
following equivalences.

Proposition 12.4.2 If M{x,-) ~ p ond sup^gg ||dM(x, < 

+00 for any x £ E, then we have 



$$\mathrm{Lyap}(Q) = \mathrm{Spr}(Q) = \mathrm{Spr}(Q') = \mathrm{Spr}_{2,\mu}(Q')$$
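
The equality of the spectral radii of $Q$ and $Q'$ is easy to observe numerically on a finite state space. The Python sketch below is an illustration only; it assumes a strictly positive potential and a lazy reversible chain on three states, and it uses a plain power iteration (the helper `spectral_radius` is a hypothetical name, valid here because the matrices are nonnegative and primitive).

```python
import math

def spectral_radius(A, iterations=500):
    """Power iteration for a nonnegative primitive matrix A (list of rows)."""
    v = [1.0] * len(A)
    top = 0.0
    for _ in range(iterations):
        v = [sum(a * b for a, b in zip(row, v)) for row in A]
        top = max(v)
        v = [a / top for a in v]
    return top

# Lazy walk on three states, reversible with respect to mu = (1/4, 1/2, 1/4).
M = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
mu = [0.25, 0.5, 0.25]
G = [0.2, 1.0, 0.6]

# Q(x, y) = G(x) M(x, y) and Q'(x, y) = G(x)^{1/2} M(x, y) G(y)^{1/2}
Q = [[G[x] * M[x][y] for y in range(3)] for x in range(3)]
Q_prime = [[math.sqrt(G[x]) * M[x][y] * math.sqrt(G[y]) for y in range(3)]
           for x in range(3)]

spr_Q = spectral_radius(Q)
spr_Q_prime = spectral_radius(Q_prime)
```

In this example both values agree with the largest root of $\lambda^2 - 0.9\lambda + 0.13 = 0$, and $Q'$ is symmetric with respect to $\mu$ while $Q$ is not.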







12.4.6 Exercises

In the next series of exercises, we analyze the connections between the
spectral analysis of the semi-group $Q$ and $\hat Q$ with

$$Q(x, dy) = G(x)\, M(x, dy) \quad\text{and}\quad \hat Q(x, dy) = M(x, dy)\, G(y)$$

and the limiting distributions $\eta_\infty$ and $\hat\eta_\infty$ (whenever they exist) of the
Feynman-Kac semi-group $\Phi$ and $\hat\Phi$ defined by

$$\Phi(\eta) = \Psi(\eta) M \quad\text{and}\quad \hat\Phi(\eta) = \Psi(\eta M) \tag{12.27}$$

where $\Psi$ represents the Boltzmann-Gibbs transformation associated to the
potential function $G$. As usual, and unless otherwise stated, we assume
that the potential function $G$ is bounded and non negative and $M$ is a
Markov transition on some measurable space $(E, \mathcal{E})$. Finally, note that any
bounded positive integral operator $Q$ such that $Q(1)(x) \in (0, \infty)$ can be
written as above by setting

$$G(x) = Q(1)(x) \quad\text{and}\quad M(x, dy) = Q(x, dy)/Q(1)(x)$$

Exercise 12.4.3: Suppose there exists a positive eigenvector $h_G \in \mathcal{B}_b(E)$
such that $Q(h_G) = G M(h_G) = h_G$. We further assume that $M$ is
reversible with respect to some distribution $\mu$, and we let $\mu_G \in \mathcal{P}(E)$ be
defined by $\mu_G(f) = \mu(h_G M(f))/\mu(h_G)$. Check that $\Phi(\mu_G) = \mu_G$.



Exercise 12.4.4: We consider the Feynman-Kac interpretation (12.24) of
the log-Lyapunov exponent $\Lambda(G)$. Using (12.25), check that

$$\Lambda(G) = \lim_{n\to\infty} \sup_{x\in\hat E} \frac{1}{n+1} \sum_{p=0}^{n-1} \log \hat\eta_p^{(x)} M(G)$$

We further assume that the pair of conditions $((G), (M)_m)$ is met for some
$m \ge 1$ and some $\epsilon = \epsilon(M) > 0$ and $r = 1/\epsilon(G) < \infty$. Using (4.34), show
that the updated semigroup satisfies condition ($\Phi$) with

i3($") < 2£“ V(”*"^)(l - 

Show that the (unique) fixed points $(\eta_\infty, \hat\eta_\infty)$ of the mappings $(\Phi, \hat\Phi)$ are
connected by the formulae $\eta_\infty = \hat\eta_\infty M$ and $\hat\eta_\infty = \Psi(\eta_\infty)$. Under the
assumptions of Exercise 12.4.3, prove that $\hat\eta_\infty(f) = \mu(h_G f)/\mu(h_G)$.



Exercise 12.4.5: We consider a pair of time homogeneous potential/transition
$(G, M)$ on some measurable space $(E, \mathcal{E})$ satisfying conditions $(G)$
and $(M)_m$ for some parameters $m$ and $(\epsilon(G), \epsilon(M)) = (1/r, \epsilon)$. Check that
$\hat\Phi$ has a unique fixed point $\hat\eta_\infty = \Psi(\eta_\infty)$. We further assume that $M$ is
reversible with respect to some measure $\mu \in \mathcal{P}(E)$. Show that $\hat\eta_\infty$ and $\mu$
are absolutely continuous, and 

h{x) =def. ^(X) € [c/r-.r-A] 

Using the fixed point equation, prove that for any $g \in L_1(\mu)$ we have

$$\mu(g\, Q(h)) = \hat\eta_\infty M(G)\; \mu(g\, h)$$

with the integral operator $Q(f) = G M(f)$. Deduce from the above that

$$Q(h) = \lambda_G\, h \quad \mu\text{-a.s.} \quad\text{with}\quad \lambda_G = \hat\eta_\infty M(G) = \eta_\infty(G)$$

Inversely, suppose that we have $Q(g) = \lambda g$ $\mu$-a.s. for some $\lambda > 0$ and
some non negative and bounded function $g$ with $\mu(g) > 0$ (by the
Perron-Frobenius theorem, $\lambda$ coincides with the top eigenvalue of the semi-group $Q$
on $L_1(\mu)$). Then, prove that

$$\hat\eta_\infty(dx) = \frac{1}{\mu(g)}\, g(x)\, \mu(dx) \quad\text{and}\quad \lambda = \lambda_G$$



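Exercise 12.4.6: Prove that the following assertions are satisfied for any
$\lambda \in \mathbb{R}$ and any bounded function $h$

$$Q(h) = \lambda h \implies \hat Q(g) = \lambda g \quad\text{with}\quad g = M(h)$$

$$\hat Q(h) = \lambda h \implies Q(g) = \lambda g \quad\text{with}\quad g = hG$$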



Exercise 12.4.7: We assume that $M$ is reversible with respect to some
positive measure $\mu$. We also assume that

$$\hat Q(h) = \lambda h \tag{12.28}$$

for some $\lambda > 0$, and some non negative function $h$. Let $\Psi_g$ be the
Boltzmann-Gibbs transformation associated to some positive potential function $g$ such
that $\mu(g) \in (0, \infty)$. Show that the following assertions are satisfied.

• If $\mu(h) \in (0, \infty)$, then the measure

$$\hat\eta_\infty =_{def.} \Psi_{Gh}(\mu) \in \mathcal{P}(E)$$

is a fixed point of the mapping $\hat\Phi$, and we have $\lambda = \hat\eta_\infty M(G)$.
In addition, this result holds true if (12.28) is only met on the set
$G^{-1}((0, \infty))$, and as soon as $\mu(hG) \in (0, \infty)$.

• If $\mu(h) \in (0, \infty)$, then the measure

$$\eta_\infty =_{def.} \Psi_h(\mu) \in \mathcal{P}(E)$$

is a fixed point of the mapping $\Phi$ and we have $\lambda = \eta_\infty(G)$.

Inversely, suppose that $\Phi$ has a fixed point $\eta_\infty \in \mathcal{P}(E)$ with a bounded
density $h = d\eta_\infty/d\mu$. In this case, prove that for any $g \in L_1(\mu)$ we have

$$\mu(g\, \hat Q(h)) = \eta_\infty(G)\; \mu(g\, h)$$

Deduce that

$$\hat Q(h) = \lambda_G\, h \quad \mu\text{-a.s.} \quad\text{with}\quad \lambda_G = \eta_\infty(G)$$

Let $h'$ be the modification of $h$ defined by

h' = h + {QW - XGh) 1 a with A = {Q{h) ^ Ac/i} 

Check that $\hat Q(h')(x) = \lambda_G\, h'(x)$, for any $x \in E$.

Exercise 12.4.8: In this exercise, we construct a Feynman-Kac semi-group
having a prescribed eigenvalue and eigenvector. To this end, we
suppose that $M$ is reversible with respect to some positive measure $\mu$ on some
measurable space $E$. We let $\lambda > 0$ and $h$ be a fixed non negative and
bounded function such that $M(h) > 0$. Check that

$$G = \lambda h / M(h) \implies Q(h) = \lambda h$$

Let $\Psi_h$ be the Boltzmann-Gibbs transformation associated to the potential
function h. Deduce that the measure ijoo = € V{E) is a fixed point 

of the mapping $ associated to the pair (G, M ) and defined in (12.27). If 
7n is the un-normalized Feynman-Kac flow associated to the pair poten- 
tial/transitions (G, M), then check that for any / € Bb{E) we have that 

7n(/) = A”+1 vo{h) EfiMXr,) n 

with $f_h = f/M(h)$, and $\hat\eta_0 = \Psi_h(\eta_0)$. Finally, check that

$$\hat\gamma_n(f) = \lambda^{n+1}\, \eta_0(h)\; \hat\eta_0 M_h^n(f_h)$$

where $M_h$ is the Markov transition defined by

$$M_h(x, dy) = M(x, dy)\, h(y) / M(h)(x)$$

Exercise 12.4.9: We assume that $M$ is reversible with respect to some
positive measure $\mu$. We also assume that $Q(h) = \lambda h$ for some $\lambda > 0$, and
some non negative function $h$ with $\mu(h) \in (0, \infty)$. If $\Psi_h$ is the
Boltzmann-Gibbs transformation associated to $h$, then prove the following assertions.

• The measure $\eta_\infty =_{def.} \Psi_h(\mu) M \in \mathcal{P}(E)$ is a fixed point of the
mapping $\Phi$ and we have $\lambda = \eta_\infty(G)$.

• The measure $\hat\eta_\infty =_{def.} \Psi_h(\mu) \in \mathcal{P}(E)$ is a fixed point of the mapping $\hat\Phi$.

Inversely, suppose that $\hat\Phi$ has a fixed point $\hat\eta_\infty \in \mathcal{P}(E)$ with a bounded
density $h = d\hat\eta_\infty/d\mu$. In this case, prove that for any $g \in L_1(\mu)$ we have

$$\mu(g\, Q(h)) = \hat\eta_\infty M(G)\; \mu(g\, h)$$

Deduce that

$$Q(h) = \lambda_G\, h \quad \mu\text{-a.s.} \quad\text{with}\quad \lambda_G = \hat\eta_\infty M(G)$$

Modify $h$ on a $\mu$-null set so as to have $Q(h) = \lambda_G h$ on all the set $E$.



Exercise 12.4.10: Let $(G, M)$ be a pair of potential/transitions on some
measurable space $(E, \mathcal{E})$. We further assume that $M$ is reversible with
respect to some positive measure $\mu$ and we have $\mu(G) \in (0, \infty)$ and $M(G)(x) \in
(0, \infty)$, for any $x \in \hat E =_{def.} G^{-1}((0, \infty))$. We consider the pair of updated
potential/transitions $(\hat G, \hat M)$ on $\hat E$ defined in (2.13) and (2.14).

• Check that $\hat M$ is reversible with respect to $\hat\mu =_{def.} \hat\Psi(\Psi(\mu)) \in \mathcal{P}(\hat E)$
where $\Psi$, resp. $\hat\Psi$, denotes the Boltzmann-Gibbs transformation
associated to the potential function $G$, resp. $\hat G$.

• Let $M$ be the bi-Laplace transition on $E = \mathbb{R}$ defined by

$$M(x, dy) = \frac{c}{2}\, e^{-c|y - x|}\, dy$$

for some $c > 0$. We also let $G = 1_{[0,L]}$ be the indicator potential
function of the interval $[0, L]$. Prove that the updated
potential/transitions $(\hat G, \hat M)$ on $[0, L]$ defined in (2.13) and (2.14) have the
following form

$$\hat G(x) = M(x, [0, L]) = 1 - 2^{-1}\left(e^{-cx} + e^{-c(L-x)}\right)$$

$$\hat M(x, dy) = \frac{c\, e^{-c|y-x|}}{2 - \left(e^{-cx} + e^{-c(L-x)}\right)}\; 1_{[0,L]}(y)\, dy$$

Also check that $\hat M$ on $[0, L]$ is reversible with respect to the measure
$\hat\mu \in \mathcal{P}([0, L])$ defined by

$$\hat\mu(dx) = c_L\; 1_{[0,L]}(x) \left(1 - 2^{-1}\left(e^{-cx} + e^{-c(L-x)}\right)\right) dx \tag{12.29}$$

where $c_L$ represents a normalizing constant.
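
A quick numerical check of the closed form of the updated potential for the bi-Laplace transition can be done with a midpoint quadrature. The Python sketch below is an illustration only; the function names `G_hat_numeric` and `G_hat_closed` and the values of $c$ and $L$ are hypothetical choices.

```python
import math

def G_hat_numeric(x, c, L, n=20000):
    """Midpoint-rule approximation of M(x, [0, L]) for the kernel (c/2) exp(-c|y - x|)."""
    step = L / n
    total = 0.0
    for k in range(n):
        y = (k + 0.5) * step
        total += 0.5 * c * math.exp(-c * abs(y - x))
    return total * step

def G_hat_closed(x, c, L):
    """Closed form 1 - (exp(-c x) + exp(-c (L - x))) / 2, valid for x in [0, L]."""
    return 1.0 - 0.5 * (math.exp(-c * x) + math.exp(-c * (L - x)))

c, L = 2.0, 1.0
gaps = [abs(G_hat_numeric(x, c, L) - G_hat_closed(x, c, L)) for x in (0.0, 0.3, 1.0)]
```

The updated potential is smallest at the two end points of the interval, where half of the mass of the bi-Laplace kernel would otherwise leave $[0, L]$.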







Exercise 12.4.11: We consider the indicator potential function and the
bi-Laplace transition $(G, M)$ introduced in the second part of Exercise 12.4.10.
For any $\beta > 0$ we set

$$h_\beta(x) = \sin(c\beta x) + \beta \cos(c\beta x)$$

• Check that c h0{y)dy = e“ sin(c;9x) and 

j c hfi{y)dy 

- sin(c/?x) -h (sm{c0L) - h0{L)j 

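• Deduce that for any $x \in [0, L]$ we have

$$(1 + \beta^2)\, \hat Q(h_\beta)(x) = h_\beta(x) - \frac{e^{-c(L-x)}}{2} \left((1 - \beta^2) \sin(c\beta L) + 2\beta \cos(c\beta L)\right)$$

with the bounded integral operator $\hat Q$ on $[0, L]$ defined by

$$\hat Q(f) = M(fG) = \hat G \hat M(f)$$

• If $cL = \pi/2$, then check that for $\beta = 1$ we have for any $x \in [0, L]$ ($= [0, \pi/(2c)]$)

$$\hat Q(h_1)(x) = \tfrac{1}{2}\, h_1(x) > 0$$

• If $cL \in (\pi/2, \pi)$, then prove that we can choose $\beta \in (\pi/(2cL), 1)$ such
that

$$\tan(cL\beta) + \frac{2\beta}{1 - \beta^2} = 0$$

In this situation, verify that for any $x \in [0, L]$ we have

$$\hat Q(h_\beta)(x) = \frac{1}{1 + \beta^2}\, h_\beta(x)$$

Check that $h_\beta(x) > 0$ for any $x \in [0, \pi/(2c\beta)]$, and for any $x \in (\pi/(2c\beta), L]$

$$\beta + \tan(c\beta x) \le \beta - \frac{2\beta}{1 - \beta^2} = -\beta\, \frac{1 + \beta^2}{1 - \beta^2} < 0$$

Deduce that $h_\beta(x) > 0$ for any $x \in [0, L]$.

• In both of the situations examined above, and using Exercise 12.4.7,
prove that the measure $\eta_\infty(dx) \propto 1_{[0,L]}(x)\, h_\beta(x)\, dx \in \mathcal{P}([0, L])$
is a fixed point of the mapping $\Phi$.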
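
The eigenfunction relations of this exercise can be checked numerically with a midpoint quadrature of the kernel $\frac{c}{2} e^{-c|x-y|}$ on $[0, L]$. The Python sketch below is only an illustration; the function names `Q_hat`, `h_beta` and `residual`, as well as the numerical values of $c$, $L$ and $\beta$, are hypothetical choices.

```python
import math

def h_beta(x, c, beta):
    """h_beta(x) = sin(c beta x) + beta cos(c beta x)."""
    return math.sin(c * beta * x) + beta * math.cos(c * beta * x)

def Q_hat(h, x, c, L, n=20000):
    """Midpoint-rule approximation of the integral of (c/2) exp(-c|x-y|) h(y) over [0, L]."""
    step = L / n
    total = 0.0
    for k in range(n):
        y = (k + 0.5) * step
        total += 0.5 * c * math.exp(-c * abs(x - y)) * h(y)
    return total * step

def residual(x, c, L, beta):
    """Gap between both sides of the formula for (1 + beta^2) Q_hat(h_beta)(x)."""
    lhs = (1.0 + beta ** 2) * Q_hat(lambda y: h_beta(y, c, beta), x, c, L)
    boundary = (1.0 - beta ** 2) * math.sin(c * beta * L) + 2.0 * beta * math.cos(c * beta * L)
    rhs = h_beta(x, c, beta) - 0.5 * math.exp(-c * (L - x)) * boundary
    return abs(lhs - rhs)

# Case c L = pi / 2 and beta = 1: Q_hat(h_1) should equal h_1 / 2 on [0, L].
c, L = 1.0, math.pi / 2
eigen_gaps = [
    abs(Q_hat(lambda y: h_beta(y, c, 1.0), x, c, L) - 0.5 * h_beta(x, c, 1.0))
    for x in (0.0, L / 3, L)
]
```

The same quadrature can be reused for any other pair $(c, L)$ and any $\beta > 0$ through `residual`.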







Exercise 12.4.12: Returning to Example 4.4.1, we assume that $E = \mathbb{Z}$
and $M$ is the Markov transition defined by

$$M(x, dy) = p(-1)\, \delta_{x-1}(dy) + p(0)\, \delta_x(dy) + p(+1)\, \delta_{x+1}(dy)$$

with p(t) e (0,1) and ]C|»|<iP(*) = 1- ^ ~ 

m > 1 and let G = Ig, then prove that for any padr (x,y) € [0,m] 
^ A|j|<ip(t)"‘ and G{x) = Af(i,[0,m]) € [A|<|<ip(t), 1]. De- 
duce that the pair potential/kemel (G, M ) satisfies the regularity condi- 
tions ((G), {M)m) with e(G) > A|j|<ip(i) and e{M) > A|<|<ip(t)"‘. Show 
that M{x,dy) = M{x,dy), for any x G (0,m), and on the boundary 
dE = {0, m] 

In the same way, prove that for any $x \in (0, m)$

$$\hat G(x) = 1, \quad \hat G(0) = 1 - p(-1) \quad\text{and}\quad \hat G(m) = 1 - p(+1)$$

Describe the particle approximation model associated with the pair $(\hat G, \hat M)$.



12.5 Directed Polymers Simulation 

12.5.1 Feynman-Kac and Boltzmann-Gibbs Models 

In biology and industrial chemistry, flexible polymer models describe the
chemical and kinetic structure of macromolecules in a given solvent. The
polymer chain at time $n$ is regarded as a sequence of random variables

$$X_n = U_{[0,n]} =_{def.} (U_0, \ldots, U_n) \in E_n = \underbrace{E \times \cdots \times E}_{(n+1)\ \text{times}}$$

taking values in some metric space $(E, d)$. The elementary states $(U_p)_{0\le p\le n}$
represent the monomers of the macromolecules $X_n$. The length parameter
$n$ represents the degree of polymerization. The monomers are connected by
chemical bonds and interact with one another as well as with the chemicals
in the solvent. The energy of a polymerization sequence

$$X_0 = U_0 \to X_1 = (U_0, U_1) \to X_2 = (U_0, U_1, U_2) \to \cdots \to X_n = U_{[0,n]}$$

is defined in terms of a Boltzmann potential

$$\exp\left\{-\frac{1}{\epsilon} \sum_{p=0}^{n} V_p(U_{[0,p]})\right\} \tag{12.30}$$

The parameter $\epsilon \in \mathbb{R}_+$ represents the temperature of the solvent, and each
potential function

$$V_n : (u_0, \ldots, u_n) \in E_n \mapsto V_n(u_0, \ldots, u_n) \in \mathbb{R}_+$$

reflects the local intermolecular energy between the monomer $U_n = u_n$ in
the polymer chain $X_{n-1} = (u_0, \ldots, u_{n-1})$ during the $n$th polymerization

$$X_{n-1} = (u_0, \ldots, u_{n-1}) \longrightarrow X_n = ((u_0, \ldots, u_{n-1}), u_n)$$

The potential functions $V_n$ depend on the nature of the solvent and the
physico-chemical structure of the polymer. At low temperature, $\epsilon \to 0$, the
interaction between monomers may be strongly repulsive at short distances
and attractive or repulsive at larger ones. For instance, the monomers may
tend to avoid being close to each other. These excluded volume effects
and repulsive interactions can be modeled by choosing a potential function
satisfying the following condition:

$$V_n(u_0, \ldots, u_n) = 0 \iff u_n \notin \{u_0, \ldots, u_{n-1}\} \tag{12.31}$$

In this situation, every self-interaction is penalized by a factor $e^{-V_n/\epsilon}$ so that
the energy of an elementary polymerization is minimal iff the new monomer
differs from the previous ones. In this context, the inverse temperature
parameter $\epsilon^{-1}$ is sometimes called the strength of repulsion. In the opposite
case, at high temperature, $\epsilon \to \infty$, the interaction forces disappear. In
this situation, it is commonly assumed that $X_n$ is an $E_n$-valued Markov
chain with elementary transitions $M_n$ and initial distribution $\eta_0$. By the
definition of the chain $X_n = U_{[0,n]}$, this Markovian hypothesis implies that
the Markov transitions $M_n$ have the form

$$M_n(x_{n-1}, dx_n) = \delta_{x_{n-1}}(d(u_0, \ldots, u_{n-1}))\; P_n((u_0, \ldots, u_{n-1}), du_n)$$

for any $x_{n-1} \in E_{n-1}$ and $x_n = (u_0, \ldots, u_n) \in E_n$ and for some Markov
transition $P_n$ from $E_{n-1}$ into $E$. Also note that whenever $(U_n)_{n\ge 0}$ is itself a
Markov chain, the transitions $P_n$ have the form

$$P_n((u_0, \ldots, u_{n-1}), du_n) = P'_n(u_{n-1}, du_n)$$

for some Markov transitions $P'_n$ from $E$ into $E$. In summary, we see that
the distribution of an abstract polymer chain with polymerization degree
$n$ is defined for any $f_n \in \mathcal{B}_b(E_n)$ by the Feynman-Kac distribution on path
space

$$\hat\eta_n(f_n) = \hat\gamma_n(f_n)/\hat\gamma_n(1)$$

with

$$\hat\gamma_n(f_n) = \mathbb{E}\left(f_n(X_n) \prod_{p=0}^{n} G_p(X_p)\right) = \mathbb{E}\left(f_n(U_{[0,n]}) \prod_{p=0}^{n} G_p(U_{[0,p]})\right) \tag{12.32}$$

and the potential functions

$$G_n : x_n \in E_n \mapsto G_n(x_n) = \exp\left\{-\frac{1}{\epsilon} V_n(x_n)\right\} \in [0, 1]$$

Another important quantity is the Feynman-Kac prediction model $\eta_n$
defined as above by taking in formula (12.32) a product up to time $(n - 1)$.
More precisely, we have

$$\eta_n(f_n) = \gamma_n(f_n)/\gamma_n(1) \quad\text{with}\quad \gamma_n(f_n) = \mathbb{E}\left(f_n(X_n) \prod_{p=0}^{n-1} G_p(X_p)\right)$$

It is finally important to recall that the so-called partition function $\hat\gamma_n(1)$
can also be expressed in terms of the flow $(\eta_p)_{p\le n}$ with the product formula

$$\hat\gamma_n(f_n) = \hat\eta_n(f_n) \prod_{p=0}^{n} \eta_p(G_p) = \eta_n(f_n G_n) \prod_{p=0}^{n-1} \eta_p(G_p) \tag{12.33}$$

This rather elementary description allows us to construct a natural
unbiased and on-line particle estimation of the partition function (see for
instance Section 11.3).

In the biostatistics literature, these Feynman-Kac models seem to be
preferably expressed in terms of the Boltzmann-Gibbs distribution densities

$$q_n(x_n) \propto \exp\left\{-\frac{1}{\epsilon} H_n(x_n)\right\} p_n(x_n)$$

The function $p_n$ represents the density of a random sequence $X_n = U_{[0,n]}$
taking values in a multidimensional state space $E_n = (\mathbb{R}^d)^{n+1}$. The
interaction energy of a given sequence $x_n = (u_0, \ldots, u_{n-1}, u_n) = (x_{n-1}, u_n)$ is
measured by the Hamiltonian function $H_n$ defined alternatively by one of
the equivalent expressions

$$H_n(x_n) = \sum_{p=0}^{n} V_p(u_0, \ldots, u_p) = H_{n-1}(x_{n-1}) + V_n(x_{n-1}, u_n)$$

In this notation, the Feynman-Kac formulae (12.32) can alternatively be
rewritten in terms of the Boltzmann-Gibbs distribution

$$\hat\eta_n(dx_n) = \frac{1}{\nu_n(e^{-H_n/\epsilon})}\; e^{-H_n(x_n)/\epsilon}\; \nu_n(dx_n) \tag{12.34}$$

where $\nu_n \in \mathcal{P}(E_n)$ represents the distribution of the path $X_n = U_{[0,n]}$.

It is of course out of the scope of this section to present a complete and 
precise catalog on all the polymer simulation models that can be derived 
using the particle methodology described in chapter 11. The reasons are 
twofold. First there does not exist a universally effective particle simula- 
tion algorithm that applies to all directed polymer models. Although all 
of these simulation techniques are built on the same particle methodology, 
their accuracy really depends on various tuning parameters such as the 
selection period, the branching rules, population size controls, the choice 
of the mutation transitions, the length of the exploration excursions, and 
others. In addition, the improvements we can get using one or another re- 
finement strategy are strongly related to the precise nature of the chemico- 
physical interactions of the model at hand. On the other hand, and due to 
the incomplete knowledge of the intermolecular potentials in real macro- 
molecules, there does not exist a precise chemico-physical polymer model. 
In order to get some insight, the literature abounds with simplified models
with diverse repulsive and attractive interaction energy landscapes.

This section is organized as follows. In Section 12.5.2, we provide a brief
discussion on evolutionary particle algorithms and their connection with
more traditional Metropolis type models such as the slithering tortoise or
the reptilian type algorithms. In the next two Sections 12.5.3 and 12.5.4,
we have collected some commonly used simplified models with repulsive
and attractive interactions. We also underline some natural connections
with self-avoiding and reinforced random walks. In the final section, 12.5.5,
we apply the genealogical-tree-based methodologies presented in Chapter 3
(see also Section 11.4) to sample polymer chains associated with a given
collection of intermolecular potentials.

12.5.2 Evolutionary Particle Simulation Methods

One challenging question in biostatistics is to generate independent
samples according to the Feynman-Kac or Boltzmann-Gibbs distributions on
path space introduced in (12.32). Several strategies have been suggested in
the literature, including traditional Monte Carlo methods such as
Metropolis-Hastings type models [216]. In the context of self-avoiding random
walks, two classical Monte Carlo strategies can be underlined: the
Berretti-Sokal, or the slithering tortoise algorithm [30], and the reptilian or
slithering snake algorithm [210, 308]. These models first consist in modifying
randomly and locally the monomers of a given chain with fixed
polymerization degree. Then these small deformations are accepted or rejected.
The recurrent drawbacks of these Monte Carlo algorithms are the following.
The potential energy function usually has too many local minima,
and its oscillations tend to slow down the convergence of the algorithm.
Furthermore, they are not recursive with respect to the polymerization
degree parameter. In this connection, we mention that if we are only
interested in generating macromolecule samples with a fixed polymerization
degree $n$, then we can take advantage of the Boltzmann-Gibbs representation
(12.34) and alternatively use the interacting Metropolis approximation
models presented in Section 11.2 and in Chapter 5. Various heuristic-like
and recursive strategies have been recently suggested, such as chain growth
methods [26, 167, 226, 235, 115] and the pioneering Rosenbluth's
pruned-enrichment technique [280]. The basic and common strategy is as follows.
We start with a random sequence of macromolecules of a given polymerization
degree, say $p_1$. We compute the Boltzmann weight of each one. Then
we properly eliminate bad configurations with high interaction energy and
replace them by cloning the good ones. In biostatistics literature this
selection stage is often called “enrichment”. Finally, we grow each selected
macromolecule up to a polymerization degree $p_2$ ($> p_1$), and so on. Here
again the choice of the selection/mutation transitions is not unique. In
some situations, the growing mechanism is almost dictated by the problem
at hand. For instance, suppose that $\hat\eta_n$ represents the path distribution of
a self-avoiding random walk $U_{[0,n]}$. In this case, it is tempting to choose
a growing/mutation transition that chooses randomly among the “free”
neighbors (see [280]). Another idea is to send out an auxiliary collection of
exploration paths of a given length to test the local environment. Then we
select one of them in accordance with its interaction energy (see [235]). A
wide range of selection/enrichment mechanisms have also been suggested
in the literature. Some of them are based on branching selection variants
such as those presented in Section 11.8 (see [235]). Some authors also
suggest the use of weights' thresholds to detect the “best” selection period and
avoid excessive duplications (see [167, 227, 307]).
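
The growth and enrichment strategy can be illustrated on the self-avoiding walk of the square lattice. The Python sketch below is a minimal example, not one of the cited algorithms: each path is grown by one uniformly chosen neighbor, the paths that self-intersect are replaced by clones of surviving ones, and the product of the survival fractions, multiplied by $4^n$, serves as an estimate of the number of self-avoiding walks of length $n$ in the spirit of the product formula (12.33). The function name `estimate_saw_count` and all parameter values are hypothetical.

```python
import random

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def estimate_saw_count(n, n_particles, seed=0):
    """Particle estimate of the number of self-avoiding walks of length n on Z^2."""
    rng = random.Random(seed)
    paths = [((0, 0),)] * n_particles
    estimate = 1.0
    for _ in range(n):
        # growth / mutation: add one uniformly chosen neighbor to every path
        grown = []
        for path in paths:
            dx, dy = rng.choice(STEPS)
            x, y = path[-1]
            grown.append(path + ((x + dx, y + dy),))
        # indicator potential: 1 if the new monomer avoids the previous ones
        ok = [p[-1] not in p[:-1] for p in grown]
        alive = [p for p, good in zip(grown, ok) if good]
        if not alive:
            return 0.0
        estimate *= 4.0 * len(alive) / n_particles
        # enrichment / selection: replace bad paths by clones of good ones
        paths = [p if good else rng.choice(alive) for p, good in zip(grown, ok)]
    return estimate

estimate_6 = estimate_saw_count(6, 5000)
```

With a few thousand paths the estimate for $n = 6$ should land close to the exact count 780 of self-avoiding walks of length 6 on the square lattice.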

Numerical results tend to indicate the superiority of these evolutionary
type algorithms, but to our knowledge no sufficient analysis has been done
to justify that these natural models are well-founded. The Feynman-Kac
representation model (12.32) and the particle recipes developed in
Chapter 11 clearly show that each of these algorithms coincides with a
particular particle interpretation of the Feynman-Kac formulae (12.32). These
evolutionary particle algorithms are essentially dictated by the dynamical
structure of the Feynman-Kac model (12.32). In this connection, we
mention that in biostatistics literature these recursions are preferably expressed
with some obvious abusive notation as follows:

$$\hat\eta_n(dx_n) \propto \hat\eta_{n-1}(dx_{n-1})\; M_n(x_{n-1}, dx_n)\; \exp\{-[H_n(x_n) - H_{n-1}(x_{n-1})]/\epsilon\}$$



12.5.3 Repulsive Interaction and Self-Avoiding 
Markov Chains 

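At low temperature, $\epsilon \to 0$, and under appropriate regularity conditions,
the Feynman-Kac measures (12.32) with repulsive interaction potentials
(12.31) converge in some sense to the distributions $\hat\gamma_n$ defined by

$$\hat\gamma_n(f_n) = \mathbb{E}\left(f_n(X_n) \prod_{p=0}^{n} G_p(X_p)\right) \tag{12.35}$$

with the indicator potential functions

$$G_n = 1_{\hat E_n} \quad\text{and}\quad \hat E_n = \{(u_0, \ldots, u_n) \in E_n \;;\; u_n \notin \{u_0, \ldots, u_{n-1}\}\} \tag{12.36}$$

When the underlying stochastic sequence is such that

$$\hat\gamma_n(1) = \mathbb{P}(X_1 \in \hat E_1, \ldots, X_n \in \hat E_n) = \mathbb{P}(\forall\, 0 \le p < q \le n,\ U_p \ne U_q) > 0$$

then the distribution flow $\hat\eta_n$ is well-defined and we have

$$\hat\eta_n = \mathrm{Law}(X_n \mid X_1 \in \hat E_1, \ldots, X_n \in \hat E_n) = \mathrm{Law}(U_{[0,n]} \mid \forall\, 0 \le p < q \le n,\ U_p \ne U_q) \tag{12.37}$$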

An Excursion Feynman-Kac Model 

The simplified directed polymer model described in (12.37) can also be
regarded as a path-particle evolution model in an absorbing medium with
hard obstacles $E_n - \hat E_n$. More precisely, if we let $T = \inf\{n \ge 0 : X_n \notin \hat E_n\}$
be the first time $X_n$ exits the set $\hat E_n$, then $\hat\eta_n$ represents the law of the path
particle $X_n = U_{[0,n]}$ given the fact that it has not been absorbed at time
$n$. In other words, in this notation we have that $\hat\eta_n = \mathrm{Law}(X_n \mid T > n)$.

Self-Avoiding Random Walks 

One of the simplest mathematical models with self-repulsive interaction
is the self-avoiding random walk model (abbreviated SAW). This rather
elementary probabilistic model is often used in practice mainly because
various authors seem to agree that it captures some qualitative features of
polymer conformations. An SAW of length $n$ is defined as a realization of
the path of a simple random walk on the $d$-dimensional lattice $E = \mathbb{Z}^d$ that
visits points no more than once. In more precise language, an SAW of length
$n$ is a random path $X_n = U_{[0,n]}$ distributed according to the Feynman-Kac
distribution $\widehat{\eta}_n$ introduced in (12.37). Recalling that the Markov transitions
of $U_n$ are defined by $P(u, \cdot) = (2d)^{-1} \sum_{|e|=1} \delta_{u+e}$ and assuming that it
starts at the origin $U_0 = 0$, we readily check that the partition functions
are given by

$$\widehat{\gamma}_n(1) = \mathbb{P}(\forall\, 0 \le p < q \le n,\ U_p \ne U_q) = |S_n|/(2d)^n \qquad (12.38)$$

where $S_n$ is the set of self-avoiding random walks of length $n$ starting
at 0. In the same way, we check that $\widehat{\eta}_n$ is the uniform distribution on $S_n$
and

$$\eta_{n+1}(u_0,\dots,u_n,u_{n+1}) = \frac{1}{|S_n|}\, 1_{S_n}(u_0,\dots,u_n)\; \frac{1}{2d} \sum_{|e|=1} 1_{u_n+e}(u_{n+1})$$

Note that in this particular case we also have $\eta_n(G_n) = |S_n|/(|S_{n-1}|(2d))$.
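For small $n$ the identities above can be checked by brute-force enumeration. The following sketch is only an illustration: the function names `count_saws`, `partition_function` and `acceptance_ratio` are hypothetical, and the depth-first enumeration is exponential in $n$, so it is usable for short walks only.

```python
from fractions import Fraction


def count_saws(n, d=2):
    """Return |S_n|, the number of n-step self-avoiding walks on Z^d started at 0."""
    # The 2d unit steps e with |e| = 1.
    steps = []
    for axis in range(d):
        for sign in (1, -1):
            e = [0] * d
            e[axis] = sign
            steps.append(tuple(e))

    def extend(last, visited, remaining):
        # Depth-first enumeration of all self-avoiding extensions.
        if remaining == 0:
            return 1
        total = 0
        for e in steps:
            nxt = tuple(a + b for a, b in zip(last, e))
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)
        return total

    origin = (0,) * d
    return extend(origin, {origin}, n)


def partition_function(n, d=2):
    """Exact value of |S_n| / (2d)^n, the right-hand side of (12.38)."""
    return Fraction(count_saws(n, d), (2 * d) ** n)


def acceptance_ratio(n, d=2):
    """Exact value of |S_n| / (|S_{n-1}| 2d), the quantity eta_n(G_n)."""
    return Fraction(count_saws(n, d), count_saws(n - 1, d) * 2 * d)
```

Multiplying `acceptance_ratio(p)` over $p = 1,\dots,n$ recovers `partition_function(n)`, which is the telescoping product behind the particle estimates of the partition functions.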



Related Repulsive Interaction Models 

Repulsive interactions can also be modeled by Boltzmann type potentials
(12.30) with $V_n = 1_{\widehat{E}_n^c}$ or $V_n(u_0,\dots,u_n) = \sum_{p=0}^{n-1} 1_{\{u_p\}}(u_n)$. The latter
potential corresponds to the Edwards model and will be discussed in Exercise 12.5.3.
To model repulsive interactions at larger distances we can
use for instance the excluded-volume potential functions
$G_n(u_0,\dots,u_n) = 1_{V_{n-1}(u_0,\dots,u_{n-1})^c}(u_n)$, where $V_{n-1}(x_{n-1})$ is a given neighborhood of
$x_{n-1} = (u_0,\dots,u_{n-1})$. Note that in this situation we have

$$\widehat{\eta}_n = \mathrm{Law}(U_{[0,n]} \mid \forall\, 1 \le p \le n,\ U_p \notin V_{p-1}(U_{[0,p-1]}))$$

as soon as $\widehat{\gamma}_n(1) = \mathbb{P}(\forall\, 1 \le p \le n,\ U_p \notin V_{p-1}(U_{[0,p-1]})) > 0$.



12.5.4 Attractive Interaction and Reinforced Markov Chains

The attractive interaction situation is closely related to self-interacting and 
reinforced Markov chains. For instance, if we choose the potential functions 

$$G_n(u_0,\dots,u_n) = \sum_{p=0}^{n-1} 1_{u_p}(u_n)$$



then during polymerizations the monomers are attracted to each other. In
addition, when $U_n$ is a homogeneous Markov chain with transition $P'$ on a
countable set $E$, then the Markov transitions $\widehat{M}_{n+1}$ defined in (11.16) are
now given for any $x_n = (u_0,\dots,u_n)$, $y_n = (v_0,\dots,v_n) \in E_n$ and $v \in E$
by



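$$\widehat{M}_{n+1}(x_n,(y_n,v)) = 1_{x_n}(y_n)\; \frac{\sum_{p=0}^{n} P'(u_n,u_p)\, 1_{u_p}(v)}{\sum_{q=0}^{n} P'(u_n,u_q)}$$

as soon as $\sum_{q=0}^{n} P'(u_n,u_q) > 0$.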

12.5.5 Particle Polymerization Techniques 

We can clearly combine the Feynman-Kac modeling techniques presented
in this section with the particle recipes described in Chapter 11 and Section 12.2
to design a collection of particle approximation and simulation
models. In this context, the corresponding genealogical-tree-based algorithms
can also be interpreted as particle polymerization models. For instance,
the simple genetic $N$-particle model associated with the Feynman-Kac
model (12.32) consists of $N$ polymer chains with degree $n$

$$\xi_n = (\xi_n^i)_{1 \le i \le N} \quad\text{and}\quad \xi_n^i = (\zeta^i_{0,n},\dots,\zeta^i_{n,n}) \in E_n$$



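During the selection stage, we randomly choose $N$ polymer chains with
common law

$$\sum_{i=1}^{N} \frac{G_n(\zeta^i_{0,n},\dots,\zeta^i_{n,n})}{\sum_{j=1}^{N} G_n(\zeta^j_{0,n},\dots,\zeta^j_{n,n})}\; \delta_{(\zeta^i_{0,n},\dots,\zeta^i_{n,n})} \qquad (12.39)$$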



This mechanism is intended to favor minimal energy polymerizations. For
instance, in the case of repulsive interaction (12.31), a given polymer with
degree $n$, say $(\zeta^i_{0,n},\dots,\zeta^i_{n,n})$, has more chance of being selected if the last
monomer $\zeta^i_{n,n}$ added during the $n$th sampled polymerization differs from
the previous ones; that is, if $\zeta^i_{n,n} \notin \{\zeta^i_{0,n},\dots,\zeta^i_{n-1,n}\}$. During the mutation
transition, each selected polymer $\widehat{\xi}_n^i = (\widehat{\zeta}^i_{0,n},\dots,\widehat{\zeta}^i_{n,n})$ evolves randomly according to the
transition $M_{n+1}$ of the path chain

$$X_n = U_{[0,n]} \longrightarrow X_{n+1} = U_{[0,n+1]} = (U_{[0,n]}, U_{n+1}) = (X_n, U_{n+1})$$

at time $n + 1$; that is

$$\xi^i_{n+1} = ((\zeta^i_{0,n+1},\dots,\zeta^i_{n,n+1}), \zeta^i_{n+1,n+1}) = ((\widehat{\zeta}^i_{0,n},\dots,\widehat{\zeta}^i_{n,n}), \zeta^i_{n+1,n+1}) \in E_{n+1} = (E_n \times E) \qquad (12.40)$$



where $\zeta^i_{n+1,n+1}$ is a random variable with distribution $P_{n+1}(\widehat{\zeta}^i_{n,n}, \cdot)$. Various
asymptotic estimates can be derived from Chapters 7 to 10. For instance,
if we let $\mathbb{P}^{(N,q)}_n$ be the distribution of the first $q$ path particles of polymerization
degree $n$

$$\mathbb{P}^{(N,q)}_n = \mathrm{Law}\big((\zeta^1_{0,n},\dots,\zeta^1_{n,n}),\dots,(\zeta^q_{0,n},\dots,\zeta^q_{n,n})\big)$$

then using Theorem 8.3.3 we have the following proposition.

Proposition 12.5.1 For any $q \le N$ and $n \ge 1$, we have the strong
propagation-of-chaos estimates

$$\|\mathbb{P}^{(N,q)}_n - \eta_n^{\otimes q}\|_{tv} \le c(\epsilon,n)\, q^2/N$$

for some finite constant $c(\epsilon, n)$ whose value only depends on the pair time
and cooling parameter $(n,\epsilon)$.







Loosely speaking, this result shows that particle models produce asymptotically
independent blocks of random variables with common law $\eta_n$. In
this sense, we can say that particle interpretations are particle simulation
techniques for sampling polymers with a given Boltzmann-Gibbs measure.
Moreover, mimicking the product formula (12.33), we construct a natural
particle approximation of the partition functions $\gamma_n(1)$ by setting

$$\gamma^N_n(1) = \prod_{p=0}^{n-1} \eta^N_p(\exp\{-V_p/\epsilon\}) \qquad (12.41)$$

where $\eta^N_n = \frac{1}{N} \sum_{i=1}^{N} \delta_{(\zeta^i_{0,n},\dots,\zeta^i_{n,n})}$ are the $N$-approximation measures
of the prediction Feynman-Kac flow $\eta_n$. Precise asymptotic properties
of these unbiased estimators can be found in Chapters 7 to 10, including
central limit theorems and exponential estimates.
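The selection/mutation recipe and the product estimate (12.41) can be sketched as follows. This is a minimal illustration under stated assumptions, not the book's implementation: the walk is the simple random walk on $\mathbb{Z}^2$, the potential is taken to be the indicator that the last monomer hits an earlier site, and the names `potential` and `genetic_polymer_model` are hypothetical.

```python
import math
import random

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # simple random walk on Z^2 (assumption)


def potential(path):
    """Assumed repulsive potential: 1 if the last monomer hits an earlier site, else 0."""
    return 1.0 if path[-1] in path[:-1] else 0.0


def genetic_polymer_model(n, num_particles, eps, rng):
    """Run n selection/mutation steps.

    Returns the N polymers of degree n and the product over p < n of the
    empirical means eta_p^N(exp(-V_p / eps)), as in (12.41).
    """
    paths = [[(0, 0)] for _ in range(num_particles)]
    estimate = 1.0
    for _ in range(n):
        weights = [math.exp(-potential(path) / eps) for path in paths]
        estimate *= sum(weights) / num_particles
        # Selection: N draws from the weighted empirical law (12.39).
        selected = rng.choices(paths, weights=weights, k=num_particles)
        # Mutation: add one monomer with the free random walk transition.
        paths = []
        for path in selected:
            dx, dy = rng.choice(STEPS)
            x, y = path[-1]
            paths.append(path + [(x + dx, y + dy)])
    return paths, estimate
```

For a very large temperature parameter `eps` the weights are all close to 1, selection is nearly uniform, and the estimate stays close to 1, which gives a quick sanity check of the normalization.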



Conditional Mutations 

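An alternative particle polymerization technique consists in using the simple
genetic $N$-particle model of the distribution flow $\widehat{\eta}_n$ defined in (12.32).

This particle simulation strategy is again defined by a genetic selection/mutation
mechanism. During the selection stage, we choose randomly $N$ polymers
$\widehat{\xi}^i_n = (\widehat{\zeta}^i_{0,n},\dots,\widehat{\zeta}^i_{n,n})$ with common law

$$\sum_{i=1}^{N} \frac{\widehat{G}_n(\zeta^i_{0,n},\dots,\zeta^i_{n,n})}{\sum_{j=1}^{N} \widehat{G}_n(\zeta^j_{0,n},\dots,\zeta^j_{n,n})}\; \delta_{(\zeta^i_{0,n},\dots,\zeta^i_{n,n})}$$

with the potential functions $\widehat{G}_n$ given for any $x_n = (u_0,\dots,u_n) \in E_n$ by the formula

$$\widehat{G}_n(x_n) = \int P_{n+1}(u_n,du)\, \exp\Big\{-\frac{1}{\epsilon} V_{n+1}(x_n,u)\Big\}$$

During the mutation transition, each selected particle $\widehat{\xi}^i_n = (\widehat{\zeta}^i_{p,n})_{0 \le p \le n}$
evolves randomly according to the transition $\widehat{M}_{n+1}$ defined in (11.16).
That is, we have that

$$\xi^i_{n+1} = ((\zeta^i_{0,n+1},\dots,\zeta^i_{n,n+1}), \zeta^i_{n+1,n+1}) = ((\widehat{\zeta}^i_{0,n},\dots,\widehat{\zeta}^i_{n,n}), \zeta^i_{n+1,n+1}) \in E_{n+1} = E_n \times E$$

where $\zeta^i_{n+1,n+1}$ is a random variable with distribution

$$\propto\; P_{n+1}(\widehat{\zeta}^i_{n,n},du)\, \exp\Big\{-\frac{1}{\epsilon} V_{n+1}(\widehat{\xi}^i_n,u)\Big\}$$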







Self-Avoiding Particle Models

At low temperature, $\epsilon \to 0$, the Feynman-Kac polymer measures (12.32)
with repulsive interaction take the form (12.35) and the discrete selection
distributions (12.39) tend to the uniform measure

$$\frac{1}{|I^N(n)|} \sum_{i \in I^N(n)} \delta_{(\zeta^i_{0,n},\dots,\zeta^i_{n,n})}$$

with $I^N(n) = \{i \;;\; (\zeta^i_{0,n},\dots,\zeta^i_{n,n}) \in \widehat{E}_n\}$. In this situation, it may happen
that all the $N$ polymerizations have intersected and $I^N(n) = \emptyset$. In Section 7.4.1,
we have seen that this event has an exponentially small probability.
When the potentials are indicator functions, it is more judicious to
use the particle algorithm associated with the McKean interpretation model
(11.10). In this situation, the selection transition consists in sampling each
polymerization $\widehat{\xi}^i_n = (\widehat{\zeta}^i_{0,n},\dots,\widehat{\zeta}^i_{n,n})$ according to the distribution

$$S_{n,m(\xi_n)}(\xi^i_n, \cdot) = 1_{\widehat{E}_n}(\xi^i_n)\, \delta_{\xi^i_n} + 1_{\widehat{E}_n^c}(\xi^i_n)\, \frac{1}{|I^N(n)|} \sum_{j \in I^N(n)} \delta_{\xi^j_n}$$

The polymers $\xi^i_n = (\zeta^i_{0,n},\dots,\zeta^i_{n,n}) \in \widehat{E}_n$ without self-intersections are not
affected by the selection stage, and we set $\widehat{\xi}^i_n = \xi^i_n$. In the opposite case, the
polymer chains $\xi^i_n = (\zeta^i_{0,n},\dots,\zeta^i_{n,n}) \notin \widehat{E}_n$ with self-intersections are killed
and replaced by a collection of polymers $\widehat{\xi}^i_n = (\widehat{\zeta}^i_{0,n},\dots,\widehat{\zeta}^i_{n,n})$ randomly
and uniformly chosen in the set $\{(\zeta^j_{0,n},\dots,\zeta^j_{n,n}) \;;\; j \in I^N(n)\}$. Arguing as
in (12.41), we construct a particle approximation of the partition functions
$\widehat{\gamma}_n(1)$ by setting

$$\widehat{\gamma}^N_n(1) = 1_{\tau^N > n} \prod_{p=0}^{n} (|I^N(p)|/N) \qquad (12.42)$$

where $\tau^N = \inf\{n \ge 0 : I^N(n) = \emptyset\}$ represents the first time the particle
algorithm is stopped. We again refer the reader to Chapters 7 to 10 for
precise asymptotic properties of these particle estimates.
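A minimal sketch of this killing/replacement scheme, together with the estimate (12.42), is given below for the simple random walk on $\mathbb{Z}^2$. The function name `self_avoiding_particle_model` and the choice $d = 2$ are assumptions made for illustration only.

```python
import random

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # simple random walk on Z^2 (assumption)


def self_avoiding_particle_model(n, num_particles, rng):
    """Return (particle estimate of |S_n| / 4^n, list of N self-avoiding paths).

    Returns (0.0, []) if every particle is killed before time n.
    """
    paths = [[(0, 0)] for _ in range(num_particles)]
    estimate = 1.0
    for p in range(n + 1):
        # I^N(p): indices of particles whose last monomer avoids all earlier ones.
        alive = [i for i, path in enumerate(paths) if path[-1] not in path[:-1]]
        if not alive:
            return 0.0, []  # the algorithm is stopped
        estimate *= len(alive) / num_particles
        alive_set = set(alive)
        # Selection: self-avoiding paths are kept, the others are replaced
        # by a path chosen uniformly among the self-avoiding ones.
        paths = [paths[i] if i in alive_set else paths[rng.choice(alive)]
                 for i in range(num_particles)]
        if p < n:
            # Mutation: extend every path by one free random walk step.
            extended = []
            for path in paths:
                dx, dy = rng.choice(STEPS)
                x, y = path[-1]
                extended.append(path + [(x + dx, y + dy)])
            paths = extended
    return estimate, paths
```

Since only self-avoiding paths survive each selection, checking the last monomer against the earlier ones is enough to decide membership in $I^N(p)$. For $n = 4$ the exact value is $100/256$, so the output can be compared with it for a moderately large number of particles.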

In Figure 12.5, we have presented a self-avoiding polymerization model
associated with $N = 7$ particles. The dotted lines stand for killed
self-intersecting lines and the thick lines represent the branching evolution of
self-avoiding polymers.

FIGURE 12.5. Particle polymerizations

Related Particle Models

The particle simulation models described above can be refined in various
ways. For instance, we can improve the accuracy of the exploration grid using
the conditional branching exclusion strategies described in Section 11.7.
The resulting particle model is again decomposed into a mutation/selection
transition, but here the particle mutation also depends on the potential
functions. The precise and formal description of this genealogical-tree-based
simulation algorithm is notationally time-consuming and it is
better understood using the abstract and general models presented in Section 11.7.
Roughly speaking, it is described as follows. Initially, we sample
$N$ independent copies $(U^i_0)_{1 \le i \le N}$ of $U_0$. From each one, we evolve $N'$ exploration
paths $(U^{i,j}_{[0,p_1)})_{1 \le j \le N'}$ of a given length, say $p_1$ ($\ge 1$), and we choose
randomly one of these auxiliary excursions with a probability proportional
to its Boltzmann weight



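$$G^{(1)}(U^{i,j}_{[0,p_1)}) = \prod_{0 \le k < p_1} G_k(U^{i,j}_{[0,k]}) = \exp\Big(-\frac{1}{\epsilon} \sum_{0 \le k < p_1} V_k(U^{i,j}_{[0,k]})\Big)$$

$$G_k(u_0,\dots,u_k) = \exp\Big\{-\frac{1}{\epsilon} V_k(u_0,\dots,u_k)\Big\}$$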



The resulting sequence of excursions $\xi^{(1),i}_0 = \zeta^i_{[0,p_1)}$ can be regarded as
$N$-approximation samples from the conditional distribution

$$\widehat{\eta}^{(1)}_0(d(u_0,\dots,u_{p_1-1})) \;\propto\; e^{-\frac{1}{\epsilon} \sum_{0 \le k < p_1} V_k(u_0,\dots,u_k)}\; \eta^{(1)}_0(d(u_0,\dots,u_{p_1-1}))$$

where $\eta^{(1)}_0$ represents the distribution of the random sequence $X^{(1)}_0 = (U_k)_{0 \le k < p_1}$.
To define the next two-step selection/mutation transitions,
we again evolve from each $\zeta^i_{[0,p_1)}$ a sequence of $N'$ independent excursions
$(U^{i,j}_{[p_1,p_2)})_{1 \le j \le N'}$ of the chain $U_{[p_1,p_2)}$ starting at $U_{p_1-1} = \zeta^i_{p_1-1}$ (for some
length $(p_2 - p_1)$ ($\ge 1$)). With some obvious abusive notation, we denote by
$M^{(1)}_1$ the Markov transition from $U_{[0,p_1)}$ to $U_{[0,p_2)} = (U_{[0,p_1)}, U_{[p_1,p_2)})$

$$M^{(1)}_1(u,d(v,w)) = \delta_u(dv)\; \mathbb{P}(U_{[p_1,p_2)} \in dw \mid U_{[0,p_1)} = v)$$







By construction, we have the estimate 

j=l * ' 



The first selection transition 

= iC[0,pi))l<k<N — > = (^,p,))l<fc<Af 

consists in randomly choosing N paths, say = C|o,pi)> proba- 

bility proportional to 

J = 1 \ Pl<Q<P2 J 



"J” Mr(Gf’>)Kfc„) 



with 



(«o,- -,«pj-i) = n 

Pi<k<P3 

The conditional mutation 

c(p).fc_?k vc(p).fc_f?fc X 

SO ^(o,pi) ^ SI 'Mo.pi)’^(pi.pj)^ 

consists in extending each selected path ^,p,) = C[o,p,) 

iliary excursions, say C|^,,pj) = ^^i’fpa)> randomly chosen with a probability 
proportional to 



exp 



H ? 

\ Pl<9<P2 



nKfco.'C.'ii) I =Gi'’K;oV)'t'£,-i,)) 



To define the next selection/mutation stages, we again evolve from each
path $\zeta^k_{[0,p_2)}$ a sequence of $N'$ independent excursions $(U^{k,j}_{[p_2,p_3)})_{1 \le j \le N'}$ of
the chain $U_{[p_2,p_3)}$ starting at $U_{p_2-1} = \zeta^k_{p_2-1}$, and so on. Note that the
$N'$-sequences are used to generate approximate polymerizations with
initial distribution $\widehat{\eta}^{(1)}_0$ and the "conditional" mutation transitions.
For a more formal and precise presentation of these branching strategies,
we refer the reader to Section 11.7 (see also Sections 11.5 and 11.6).



12.5.6 Exercises 

Exercise 12.5.2: [Rosenbluth's pruned-enrichment model [280]] We consider
the SAW model described on page 489 on the square lattice $d = 2$.
Check that in this case the transitions $M_n$ of the path-valued Markov chain
$X_n = U_{[0,n]}$ are given for any $x_n = (u_0,\dots,u_n)$, $y_n = (v_0,\dots,v_n) \in (\mathbb{Z}^2)^{n+1}$
and $v \in \mathbb{Z}^2$ by

$$M_{n+1}(x_n,(y_n,v)) = 1_{x_n}(y_n)\; \frac{1}{4} \sum_{i=1}^{4} 1_{u_n+e_i}(v)$$

with $e_1 = (1,0)$, $e_2 = (0,1)$, $e_3 = (-1,0)$, and $e_4 = (0,-1)$. Given a
sequence $x_n = (u_0,\dots,u_n)$ of polymerization degree $n$, we denote by

$$C(x_n) = \{e_i \;;\; 1 \le i \le 4,\ u_n + e_i \notin \{u_0,\dots,u_n\}\}$$

the set of available directions for placing the next monomer
without intersecting $x_n$. Note that a given SAW $x_n$ may be trapped in the
sense that it cannot be extended to a new SAW; i.e., $C(x_n) = \emptyset$. Prove
that whenever $|C(x_n)| \ne 0$ the "conditional" transitions $\widehat{M}_{n+1}$ defined in
(11.16) are defined for any $y_n = (v_0,\dots,v_n) \in (\mathbb{Z}^2)^{n+1}$ and $v \in \mathbb{Z}^2$ by the
formula

$$\widehat{M}_{n+1}(x_n,(y_n,v)) = 1_{x_n}(y_n)\; \frac{1}{|C(x_n)|} \sum_{e \in C(x_n)} 1_{u_n+e}(v)$$

A Markov chain with these transitions is sometimes called a myopic SAW
or a "true" SAW (see [6]). Finally, check that the mutation stage in the
particle model associated with the pair $(\widehat{G}_n, \widehat{M}_n)$ consists in extending
each path, avoiding the occupied neighbors, and the potential function
counts the proportion of unoccupied neighbors around the last visited site;
i.e., $\widehat{G}_n(x_n) = |C(x_n)|/4$.
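The myopic growth rule and its weights can be sketched as follows. This is an illustrative sketch of the classical Rosenbluth weighting without pruning or enrichment, not a solution endorsed by the text; the names `available_directions`, `myopic_saw` and `estimate_saw_count` are hypothetical. Each grown walk carries the weight $\prod_k |C(x_k)|$, which equals $4^n$ times the product of the potentials $|C(x_k)|/4$, and the average weight is an unbiased estimate of $|S_n|$.

```python
import random

DIRECTIONS = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # e_1, e_2, e_3, e_4


def available_directions(path):
    """Return C(x_n): the directions leading to a site not yet visited by the path."""
    x, y = path[-1]
    occupied = set(path)
    return [(dx, dy) for dx, dy in DIRECTIONS if (x + dx, y + dy) not in occupied]


def myopic_saw(n, rng):
    """Grow one myopic SAW with at most n steps.

    Returns (path, weight) where weight is the product of |C(x_k)| along the
    growth, or 0 if the walk gets trapped before reaching n steps.
    """
    path = [(0, 0)]
    weight = 1
    for _ in range(n):
        free = available_directions(path)
        if not free:
            return path, 0  # trapped walk
        weight *= len(free)
        dx, dy = rng.choice(free)
        x, y = path[-1]
        path.append((x + dx, y + dy))
    return path, weight


def estimate_saw_count(n, num_samples, rng):
    """Average Rosenbluth weight over independent myopic walks."""
    return sum(myopic_saw(n, rng)[1] for _ in range(num_samples)) / num_samples
```

For $n = 4$ no walk can be trapped and the weights take only the values 72 and 108, with mean exactly $|S_4| = 100$.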



Exercise 12.5.3: [Edwards' model [130]] Suppose $U_n$ is the simple random
walk on $E = \mathbb{Z}^d$, and the potential functions $G_n$ are given by

$$G_n(u_0,\dots,u_n) = \exp\Big\{-\frac{1}{\epsilon} \sum_{p=0}^{n-1} 1_{\{u_p\}}(u_n)\Big\}$$

Show that in this case the unnormalized Feynman-Kac distributions introduced
in (12.32) are given by the formulae

$$\widehat{\gamma}_n(f_n) = \mathbb{E}\Big( f_n(U_{[0,n]})\, \exp\Big\{-\frac{1}{\epsilon} \sum_{0 \le p < q \le n} 1_{U_p}(U_q)\Big\} \Big)$$

These measures are sometimes called the weakly SAW or the Domb-Joyce
model. There exist various conjectures related to this polymer model. For
instance, the order of magnitude of the distance $|U_n|$ between the endpoints
of the polymer $X_n = U_{[0,n]}$ is not known in dimensions $d = 2,3,4$. For
$d \ge 5$, Hara and Slade have proved in [173] that $|U_n| \simeq \sqrt{n}$ and for $d = 1$
Greven and den Hollander have checked in [169] that $|U_n| \simeq n$. Construct a
genealogical tree model to generate approximate samples of the
Edwards-Domb-Joyce model.



Exercise 12.5.4: [Lawler [214]] We consider the SAW model described on
page 489. Using the fact that any $(p+q)$ SAW is a concatenation of $p$ and
$q$ steps, prove that $|S_{p+q}| \le |S_p|\,|S_q|$. Recalling that an SAW cannot return
to the most recently visited site, check that $d^n \le |S_n| \le 2d\,(2d-1)^{n-1}$. By
subadditivity arguments, prove that the connective constant $c(d)$ defined
by $c(d) = \lim_{n \to \infty} |S_n|^{1/n}$ exists and $c(d) \in [d, (2d-1)]$. The exact values of
$c(d)$ are unknown. Using (12.42) and (12.38), propose a particle estimation
of these connective constants.



12.6 Filtering/Smoothing and Path estimation 

12.6.1 Introduction 

Feynman-Kac distributions and their particle approximation models play 
a major role in the theory of nonlinear filtering. We recall that the filtering 
problem consists in computing the conditional distributions of a state sig- 
nal X given a sequence of observations Y. To understand the motivation 
behind this problem, we can think of the signed X as being the Markovian 
model for the time evolution of a target in tracking problems. The obser- 
vation process Y represents the noisy and partial information delivered by 
some sensors such as RADAR (Radio Detection and Ranging) or SONAR 
(Sound Navigation and Ranging). Of course, the exact values of the signal 
X and the values of the various disturbance sources are not known but it 
is reasonable to assmne that we know their statistical structure. 

Filtering problems arise in various application areas, including applied 
probability, engineering science, and particularly in advanced signal pro- 
cessing, as well as in financial mathematics and biology. They provide a 
natural prediction/updating probabilistic model for the on-line estimation 
of some quantity evolving in some sensor environment. Each applied scien- 
tific discipline tends to use a different language to express and analyze the 
same filtering problems. For the convenience of the reader and to better
connect these application areas, we have collected four different ways to
introduce a nonlinear filtering problem.

In the first probabilistic interpretation, the signal/observation pair is regarded
as a two-component Markov chain. In engineering literature, we
instead start with a Markov signal process given by a dynamical physical
equation, and the observation sequence is given by a sensor equation.
Another abstract way to introduce the filtering problem consists in
introducing a new reference probability measure. The last interpretation
comes from the Bayesian literature.

A Markov chain filtering model 

Let $(X,Y) = \{(X_n,Y_n) \;;\; n \ge 0\}$ be a Markov chain taking values in
some product spaces $\{(E_n \times F_n) \;;\; n \ge 0\}$. Here $\{(F_n,\mathcal{F}_n) \;;\; n \ge 0\}$ is an
auxiliary sequence of measurable spaces. We further assume that the initial
distribution $\nu_0$ and the Markov transitions $T_n$ of $(X,Y)$ have the form

$$\nu_0(d(x_0,y_0)) = g_0(x_0,y_0)\; \eta_0(dx_0)\; q_0(dy_0) \qquad (12.43)$$

$$T_n((x_{n-1},y_{n-1}),d(x_n,y_n)) = g_n(x_n,y_n)\; M_n(x_{n-1},dx_n)\; q_n(dy_n) \qquad (12.44)$$

where, for each $n \in \mathbb{N}$, $g_n : E_n \times F_n \to (0,\infty)$ is a strictly positive function,
$q_n \in \mathcal{P}(F_n)$, $\eta_0 \in \mathcal{P}(E_0)$ and $M_n$ are Markov transitions from $E_{n-1}$ into
$E_n$.

Engineering presentation 

In engineering and advanced signal processing literature, an alternative and
more classical way to define the pair (signal/observation) Markov process
$(X,Y)$ is as follows. The signal $X_n$ is a Markov chain with transition probability
kernels $M_n$ and taking values at each time $n$ in some measurable
space $(E_n, \mathcal{E}_n)$. In some instances, $X_n$ is described by a dynamical equation

$$X_n = F_n(X_{n-1},W_n) \qquad (12.45)$$

where $W_n$ represents a sequence of independent random variables taking
values in some measurable space $(S^w_n,\mathcal{S}^w_n)$ and $F_n : E_{n-1} \times S^w_n \to E_n$ is a
given measurable drift function. In this case, we readily check that

$$M_n(f_n)(x_{n-1}) = \mathbb{E}(f_n(F_n(x_{n-1}, W_n)))$$

The observation process is defined for each $n \ge 0$ by a sensor equation

$$Y_n = H_n(X_n,V_n) \qquad (12.46)$$

The sequence $V_n$ is independent of $X$ and represents the noise sources. It
consists of a collection of independent random variables taking values in
some auxiliary measurable spaces $(S^v_n, \mathcal{S}^v_n)$. For each $n \ge 0$, the random
variable $V_n$ is distributed according to a given probability measure in $\mathcal{P}(S^v_n)$.
The collection of measurable functions $H_n : E_n \times S^v_n \to F_n$ is chosen so
that

$$\mathbb{P}(H_n(x_n,V_n) \in dy_n) = g_n(x_n,y_n)\; q_n(dy_n) \qquad (12.47)$$

for each $x_n \in E_n$. In other words, the laws of $H_n(x_n,V_n)$ and $V_n$ are
absolutely continuous and $g_n(x_n, \cdot)$ is the corresponding density.
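The pair of equations (12.45) and (12.46) translates directly into a simulation loop. The sketch below is a hedged illustration: the function `simulate_signal_observation`, its callable arguments, and the scalar autoregressive example are all assumptions introduced here, not part of the original model description.

```python
import random


def simulate_signal_observation(n, sample_x0, drift, sensor, sample_w, sample_v):
    """Simulate (X_0, ..., X_n) and (Y_0, ..., Y_n).

    drift(k, x, w) plays the role of F_k and sensor(k, x, v) the role of H_k;
    sample_w(k) and sample_v(k) draw the independent noises W_k and V_k.
    """
    x = sample_x0()
    xs = [x]
    ys = [sensor(0, x, sample_v(0))]
    for k in range(1, n + 1):
        x = drift(k, x, sample_w(k))      # X_k = F_k(X_{k-1}, W_k)
        xs.append(x)
        ys.append(sensor(k, x, sample_v(k)))  # Y_k = H_k(X_k, V_k)
    return xs, ys


# Hypothetical scalar example: X_k = 0.9 X_{k-1} + W_k and Y_k = X_k + 0.5 V_k.
rng = random.Random(0)
signal, observations = simulate_signal_observation(
    n=10,
    sample_x0=lambda: rng.gauss(0.0, 1.0),
    drift=lambda k, x, w: 0.9 * x + w,
    sensor=lambda k, x, v: x + 0.5 * v,
    sample_w=lambda k: rng.gauss(0.0, 1.0),
    sample_v=lambda k: rng.gauss(0.0, 1.0),
)
```

In this additive Gaussian example the density $g_n(x_n, \cdot)$ of (12.47), taken with respect to Lebesgue measure, is the Gaussian density centered at $x_n$ with standard deviation 0.5.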




A change of reference probability model

This technique is particularly useful in modeling continuous time nonlinear
filtering problems. In the discrete time case, the idea is to consider the
canonical process associated with the chain $(X, Y)$ with initial distribution $\nu_0$

$$\Big(\Omega = \prod_{n \ge 0} (E_n \times F_n),\ G = (G_n)_{n \in \mathbb{N}},\ (X,Y) = (X_n,Y_n)_{n \in \mathbb{N}},\ \mathbb{P}\Big)$$

Let $\overline{\mathbb{P}}$ be the probability measure on $\Omega$ defined by its restrictions $\overline{\mathbb{P}}_n$
to $\Omega_n = \prod_{p=0}^{n} (E_p \times F_p)$

$$\overline{\mathbb{P}}_n(d((x_0,y_0),\dots,(x_n,y_n))) = \overline{\mathbb{P}}^X_n(d(x_0,\dots,x_n))\; \overline{\mathbb{P}}^Y_n(d(y_0,\dots,y_n))$$

with

$$\overline{\mathbb{P}}^X_n(d(x_0,\dots,x_n)) = \eta_0(dx_0) M_1(x_0,dx_1) \cdots M_n(x_{n-1},dx_n)$$

$$\overline{\mathbb{P}}^Y_n(d(y_0,\dots,y_n)) = q_0(dy_0)\, q_1(dy_1) \cdots q_n(dy_n)$$

In other words, under this new reference measure $\overline{\mathbb{P}}$, $X$ is again a Markov
chain with initial distribution $\eta_0$, and $Y = (Y_n)_{n \ge 0}$ is a sequence of independent random
variables, independent of $X$, with respective distributions
$q = (q_n)_{n \ge 0}$. Let $\mathbb{P}_n$ be the restriction of $\mathbb{P}$ to $\Omega_n$. By the definition of the
Markov kernel (12.44) of the chain $(X_n,Y_n)$ under $\mathbb{P}$, the distributions $\mathbb{P}_n$
and $\overline{\mathbb{P}}_n$ are absolutely continuous with one another. Their Radon-Nikodym
derivatives are defined for $\overline{\mathbb{P}}_n$-a.e. $((x_0,y_0),\dots,(x_n,y_n)) \in \Omega_n$ by the formula

$$\frac{d\mathbb{P}_n}{d\overline{\mathbb{P}}_n}((x_0,y_0),\dots,(x_n,y_n)) = \prod_{p=0}^{n} g_p(x_p,y_p)$$

Using one of these interpretations, we find that

$$\mathbb{P}((X_0,\dots,X_n) \in d(x_0,\dots,x_n),\ (Y_0,\dots,Y_n) \in d(y_0,\dots,y_n))$$

$$= \Big[\prod_{p=0}^{n} g_p(x_p,y_p)\, q_p(dy_p)\Big] \times [\eta_0(dx_0) M_1(x_0,dx_1) \cdots M_n(x_{n-1},dx_n)]$$



A Bayesian filtering presentation 

In the Bayesian literature, the authors sometimes abuse the notation and
adopt a simplified and intuitive presentation of a filtering problem. In this
notation, the conditional distributions of $Y_n$ given $X_n$ are instead denoted
by

$$p^{Y|X}_n(y_n|x_n)\, dy_n = \mathbb{P}(Y_n \in dy_n \mid X_n = x_n) = \mathbb{P}(H_n(x_n,V_n) \in dy_n)$$

The quantity $dy_n$ has to be understood as a given probability measure on
$(F_n, \mathcal{F}_n)$. To connect this notation with (12.47) we have

$$p^{Y|X}_n(y_n|x_n)\, dy_n = g_n(x_n,y_n)\, q_n(dy_n)$$

In other words, $dy_n$ stands for $q_n(dy_n)$, and $p^{Y|X}_n$ represents the likelihood
potential function

$$p^{Y|X}_n(y_n|x_n) = g_n(x_n,y_n)$$

In the same line of ideas, the elementary transitions of $X_n$ are instead
written in this field as

$$p^X_n(x_n|x_{n-1})\, dx_n = \mathbb{P}(X_n \in dx_n \mid X_{n-1} = x_{n-1})$$

The quantity $dx_n$ is more difficult to connect appropriately to our abstract
Markov kernels. The notation above must be thought of as

$$p^X_n(x_n|x_{n-1})\, dx_n = M_n(x_{n-1},dx_n) \quad\text{and}\quad p^X_0(x_0)\, dx_0 = \eta_0(dx_0)$$

Some authors also suppress the superscripts $(\cdot)^{Y|X}$ and $(\cdot)^X$ and the time
index. In this simplified notation, we have

$$\mathbb{P}((X_0,\dots,X_n) \in d(x_0,\dots,x_n)) = p(x_0)\, p(x_1|x_0) \cdots p(x_n|x_{n-1})\; dx_0 \cdots dx_n$$

and

$$\mathbb{P}((Y_0,\dots,Y_n) \in d(y_0,\dots,y_n) \mid (X_0,\dots,X_n) = (x_0,\dots,x_n)) = p(y_0|x_0) \cdots p(y_n|x_n)\; dy_0 \cdots dy_n$$

12.6.2 Motivating Examples 

The literature on Bayesian statistics, sequential Monte Carlo methods, and 
other engineering sciences abounds with applications of particle algorithms 
to filtering problems. It is clearly out of the scope of this section to present 
a precise catalog on all of these applications. We rather refer the inter- 
ested reader to the list of referenced articles. To illustrate this section and 
better connect the particle methodology developed in this book with the 
existing applied literature, we provide a brief discussion on some typical
filtering problems currently studied in engineering literature (see another 
complementary series of examples provided in Section 12.6.2). 

Positioning and Tracking Problems 

One typical estimation problem arising in engineering literature is to estimate
the dynamics of a moving object evolving in some sensor environment.
For instance, in classical tracking problems, we estimate a target motion
using radar or sonar observations. The physical measurements are often
related to some signal arrival time delays or Doppler effects. In the context
of global positioning systems (GPS), the electronic device delivers position
estimates by measuring arrival times of a series of signals emitted by a
satellite [43, 202]. In mobile robot localization problems [148, 111, 212, 282], 
the measurement data are collected from the robot’s observations, such as 
its distance to a wall. In people tracking problems, we first need to design 
a simplified human body model. Then the observation process is as usual 
related to some image/audio sensors [184, 187, 188]. In navigational posi- 
tioning problems, the ships are equipped with devices that measure their 
relative range with respect to some reference point [292]. 

To illustrate this rather general class of models, we present a simple positioning
problem in wireless communication networks. This example is taken
from [259]. The signal process is a simple Markov chain that represents the
random evolution of a vehicle. The components of the state vector
$X_n = (X^1_n, X^2_n, X^3_n)$ represent respectively the position, velocity, and acceleration
coordinates. The location components depend on the network of streets
and roads on which the vehicle travels. For the pair speed/acceleration
components, we can use the physical model described in the introductory
Section 1.1. The vehicle $X_n$ evolves in a wireless radio environment. At
each time, we receive radio measurements from several base stations on the
position of the vehicle. Assuming that these stations are located at some
fixed sites, say $B^i$, $i \in I$, a generic model of multisensor measurements is

$$Y^i_n = d(X^1_n, B^i) + V^i_n, \qquad i \in I$$

where $d(\cdot,\cdot)$ represents some pseudo-distance criterion and $V^i_n$ a collection
of independent sensor perturbations. In these wireless network positioning
problems, the vehicle process $X_n$ often makes sharp turns, and its random
dynamics are strongly nonlinear. As a result, this filtering problem is far
from being linear/Gaussian, and an extended Kalman-Bucy filter often
offers poor estimation results. Notice that this elementary model can be
extended in various ways. For instance, we can consider moving base ra- 
dio stations or multiple vehicle trackings or consider position tracking of 
microcell and mobile phones. The latter application area has recently re- 
ceived much interest. More details, as well as precise comparisons with the 
traditional Kalman filter approach, can be found in the referenced articles. 



Multiple Models Estimation 

Let $X^1_n$ be a Markov chain taking values in some measurable space $E^1$
(equipped with a $\sigma$-algebra $\mathcal{E}^1$) with initial distribution $\eta^1_0$ and elementary
transitions $M^1_n$. Given this chain we suppose the pair process $(X^2_n, Y_n)$
is a given $\mathbb{R}^{p+q}$-valued Markov chain defined by the recursive relations

$$\begin{cases} X^2_n = A_n(X^1_n)\, X^2_{n-1} + a_n(X^1_n) + B_n(X^1_n)\, W_n, & n \ge 1 \\ Y_n = C_n(X^1_n)\, X^2_n + c_n(X^1_n) + V_n, & n \ge 0 \end{cases} \qquad (12.48)$$

for some measurable mappings $(A_n, B_n, C_n)$ from $E^1$ into the sets of matrices
and some drift functions $(a_n,c_n)$ with appropriate dimensions. As
usual, the sequences of random variables $W_n$ and $V_n$ are independent
and independent of $X^2_0$ and $X^1$. They take values in $\mathbb{R}^{d_w}$ and $\mathbb{R}^{d_v}$ and
are distributed according to a centered Gaussian distribution with covariance
matrices

$$R^v_n = \mathbb{E}(V_n V_n') \quad\text{and}\quad R^w_n = \mathbb{E}(W_n W_n')$$

Given $X^1_0$, the initial random variable $X^2_0$ is a Gaussian random variable
in $\mathbb{R}^p$ with a mean and covariance matrix that only depend on $X^1_0$ and are
denoted by

$$\mathrm{Mean}_0(X^1_0) = \mathbb{E}(X^2_0 \mid X^1_0)$$

$$\mathrm{Cov}_0(X^1_0) = \mathbb{E}\big((X^2_0 - \mathbb{E}(X^2_0 \mid X^1_0))(X^2_0 - \mathbb{E}(X^2_0 \mid X^1_0))' \mid X^1_0\big)$$

These linear/Gaussian models arise in various application areas such as
in multimodel estimation. In this context, the process $X^1_n$ represents the
possible values of the system parameters as well as the different noise levels. 
For instance, in the space shuttle orbiter entry model proposed by Ewell 
in [142], when the acceleration enters below some level, the shuttle dynamics 
switch to some cruise navigation. Related switching models associated with 
judicious thresholds can be found in [236] and [292]. 

These multimodel filtering problems are often solved numerically by using
a judicious hypothesis-testing technique on a collection of likely linear
optimal filters associated with each possible value of the system parameters.
The only interaction occurs when we combine these models appropriately 
to obtain the output estimate. These rather well-known techniques go back 
to a pioneering article of Magill [238] on system identification and published 
in 1965; see also Bar-Shalom and Fortmann [24] for applications to missile- 
tracking models with different types of maneuvers. These ideas are also 
related to model-fusion strategies. In the latter, the multiple Kalman pre- 
diction models are regarded as measurements delivered by a virtual sensor. 

These hypothesis-testing algorithms provide quite accurate results when
we have a small number of likely hypotheses (see for instance [292] and
references therein). In more general instances, the structure of the set of
hypotheses is more complex and may also vary in time. As mentioned by
Stengel in [292], page 405-406, one natural idea is to refine the filter adaptation
by dropping the filters associated with less likely hypotheses and
duplicating the ones associated with the most probable set of parameters.







The engineering literature on tracking maneuvering targets or on failure
detections abounds with heuristic-like algorithms based on these evolutionary
ideas. We refer the interested reader to the filter-spawning method
presented by Fisher in [145] in the context of the VISTA F-16 actuator
failure estimation or the switching algorithm [224] as well as the interacting
multiple model algorithm (IMM) of Blom and Bar-Shalom [34]; see
also [221, 222, 283, 284, 314] for precise application models. Related interesting
schemes can be found in [58, 224, 225, 241]. In Section 12.6.7, we
will show that the Feynman-Kac modeling and the particle methodology
described in this book provide a natural and firm theoretical treatment on
multiple fusion estimation models.

Stochastic Volatility Estimation 

The extended Black-Scholes model describing the dynamics of the price of
a given risky asset is defined by the stochastic equation

$$dY_t = Y_t\,(r\,dt + X_t\,dV_t) \qquad (12.49)$$

where $r$ is an instantaneous interest rate and $V_t$ is a standard Wiener motion.
Assume that the unobserved volatility process $X_t$ satisfies the equation

$$dX_t = -a\,(X_t - X_0)\,dt + b\,dW_t$$

for some fixed parameters $a, b > 0$ and a standard Wiener motion $W_t$, independent
of $V_t$. If we discretize the time using the Euler approximation with
a fixed mesh $\Delta$, then we obtain with some obvious abusive notation the
discrete time filtering model

$$X_n = X_{n-1} - a\,(X_{n-1} - X_0)\,\Delta + b\sqrt{\Delta}\; W_n$$

$$Y_n = Y_{n-1}\,(1 + r\Delta + X_n\sqrt{\Delta}\; V_n)$$

where $V_n$ and $W_n$ are independent sequences of iid standard Gaussian variables.
Note that, using the explicit solution of (12.49), we can alternatively
use the discrete observation model

$$Y_n = Y_{n-1}\, \exp\big([r - X_n^2/2]\,\Delta + X_n\sqrt{\Delta}\; V_n\big)$$
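The discretized model can be simulated in a few lines. The sketch below uses the exponential observation model, which keeps the prices positive; the function name `simulate_stochastic_volatility` and all parameter values in the usage example are hypothetical choices made for illustration.

```python
import math
import random


def simulate_stochastic_volatility(n, a, b, r, delta, x0, y0, rng):
    """Simulate n Euler steps of the volatility X and the price Y.

    X_k = X_{k-1} - a (X_{k-1} - x0) delta + b sqrt(delta) W_k
    Y_k = Y_{k-1} exp((r - X_k^2 / 2) delta + X_k sqrt(delta) V_k)
    with independent standard Gaussian W_k and V_k.
    """
    xs, ys = [x0], [y0]
    root = math.sqrt(delta)
    for _ in range(n):
        x_prev, y_prev = xs[-1], ys[-1]
        x = x_prev - a * (x_prev - x0) * delta + b * root * rng.gauss(0.0, 1.0)
        y = y_prev * math.exp((r - x * x / 2.0) * delta + x * root * rng.gauss(0.0, 1.0))
        xs.append(x)
        ys.append(y)
    return xs, ys


# Hypothetical parameter values, for illustration only.
volatility, prices = simulate_stochastic_volatility(
    n=250, a=2.0, b=0.3, r=0.03, delta=1.0 / 250, x0=0.2, y0=100.0,
    rng=random.Random(0))
```

In a filtering experiment only `prices` would be handed to the estimation algorithm, while `volatility` plays the role of the hidden signal to be recovered.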

We notice that, using a classical state-space enlargement technique, we can 
include the parameters (a, b) in the state space. In a different but related 
context, Viens has adapted in [302] a general particle-filtering method of 
a joint work of the author with Jacod and Protter [90] in order to address 
the question of stochastic volatility filtering in financial math. He has used 
this method for solving a stochastic portfolio optimization problem under a 
partially observed stochastic volatility model, using elements of stochastic 
control, and providing a Monte Carlo method that solves the filtering and 
the stochastic control problem in unison. Stochastic volatility estimation
has been proposed using particle methods for several years. The most popular
method consists of invoking filtering by an ARCH/GARCH model, as
proposed by Nelson in [255]. Related models and numerical methods can
be found in the chain of articles [35, 37, 53, 131, 147, 155, 257].



Hidden Markov Models 

Hidden Markov chains (HMM) are particular examples of filtering problems
for which the signal/observation model has a fixed and deterministic
component. We assume that the unknown component $\theta$ belongs to
some measurable space $(S,\mathcal{S})$, and we associate with each $\theta$ the pair
signal/observation model defined as in (12.44) by the formulae

$$\nu_{\theta,0}(d(x_0,y_0)) = g_{\theta,0}(x_0,y_0)\; \eta_{\theta,0}(dx_0)\; q_0(dy_0) \qquad (12.50)$$

$$T_{\theta,n}((x_{n-1},y_{n-1}),d(x_n,y_n)) = g_{\theta,n}(x_n,y_n)\; M_{\theta,n}(x_{n-1},dx_n)\; q_n(dy_n) \qquad (12.51)$$

In the display above, $g_{\theta,n}$, $\eta_{\theta,0}$ and $M_{\theta,n}$ are collections of positive functions,
measures, and Markov transitions on appropriate state spaces (see
page 498) and indexed by $\theta$. The HMM problem is as follows. We observe
a series of measurements $Y_p$, $p \le n$, corresponding to some unknown value
of the parameter, say $\theta^*$. These HMM and related stochastic autoregressive
models occur in various application areas, including in speech recognition [190],
biology [56], neurosciences [150], and economics [171, 172, 52].

The numerical estimation techniques fall into two categories, namely the 
maximum likelihood and the Bayesian estimators. These two approaches 
are discussed below. 

In the Bayesian approach, we suppose that the unknown parameter $\theta^*$
is a realization of some random variable $\Theta$ with distribution $r \in \mathcal{P}(S)$. In
this situation, if we take $\mathcal{X}_n = (X_n,\Theta)$, then we see that the pair sequence
$(\mathcal{X}_n,Y_n)$ is again a Markov chain of the same form as the one described
in (12.44). These ideas can be extended in a natural way by considering
the unknown parameter $\theta^*$ as a realization of the initial condition $\Theta_0$ of an
auxiliary Markov chain $\Theta_n$. This Bayesian methodology proposes a way to
reduce the HMM problem to a classical filtering problem.

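The maximum likelihood estimators are defined as the sequence of parameters
$\theta_n$ that maximize the conditional log-likelihood functions defined
by

$$\Lambda_n(\theta,\theta^*) = \frac{1}{n+1} \log \int \Big\{\prod_{p=0}^{n} g_{\theta,p}(x_p,y^*_p)\Big\}\; \eta_{\theta,0}(dx_0) \prod_{p=1}^{n} M_{\theta,p}(x_{p-1},dx_p) \qquad (12.52)$$

where $(y^*_n)_{n \ge 0}$ represents a series of observations associated with the parameter $\theta^*$.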







12.6.3 Feynman-Kac Representations 

To simplify the presentation, we fix the sequence of observations $Y = y$.
A version of the conditional distributions of the signal states given their
noisy observations is expressed in terms of Feynman-Kac formulae of the
same type as the ones discussed above. More precisely, let $G_n$ be the
nonhomogeneous function on $E_n$ defined for any $x_n \in E_n$ by

$$G_n(x_n) = g_n(x_n,y_n) \qquad (12.53)$$

Note that $G_n$ depends on the observation value $y_n$ at time $n$. In this notation,
the conditional distributions of the path $X_{[0,n]} =_{\text{def.}} (X_0,\dots,X_n)$
given the sequence of observations $Y_{[0,n]} =_{\text{def.}} (Y_0,\dots,Y_n)$ from the origin
up to time $n$ are given by the path Feynman-Kac measures

$$\mathbb{Q}_n(d(x_0,\dots,x_n)) = \mathbb{P}(X_{[0,n]} \in d(x_0,\dots,x_n) \mid Y_{[0,n]} = (y_0,\dots,y_n))$$

$$= \frac{1}{Z_n} \Big\{\prod_{p=0}^{n} G_p(x_p)\Big\} \times [\eta_0(dx_0) M_1(x_0,dx_1) \cdots M_n(x_{n-1},dx_n)]$$

with the normalizing constants

$$Z_n = \int \Big\{\prod_{p=0}^{n} G_p(x_p)\Big\} \times [\eta_0(dx_0) M_1(x_0,dx_1) \cdots M_n(x_{n-1},dx_n)] \qquad (12.54)$$

The prediction and updated marginal distributions are defined for any test function $f_n \in \mathcal{B}_b(E_n)$ by

$$\eta_n(f_n) = \gamma_n(f_n)/\gamma_n(1) \quad\text{and}\quad \hat\eta_n(f_n) = \hat\gamma_n(f_n)/\hat\gamma_n(1)$$

with the unnormalized distributions

$$\gamma_n(f_n) = \mathbb{E}_{\eta_0}\Big(f_n(X_n) \prod_{p=0}^{n-1} G_p(X_p)\Big) \quad\text{and}\quad \hat\gamma_n(f_n) = \gamma_n(G_n f_n)$$

Due to the choice of potential functions (12.53), the distributions $\eta_n$ and $\hat\eta_n$ coincide respectively with the one-step predictor and the optimal filter

$$\eta_n = \mathrm{Law}(X_n \mid Y_{[0,n-1]} = (y_0, \ldots, y_{n-1}))$$

$$\hat\eta_n(f) = \eta_n(f G_n)/\eta_n(G_n), \quad\text{that is}\quad \hat\eta_n = \mathrm{Law}(X_n \mid Y_{[0,n]} = (y_0, \ldots, y_n))$$

with $Y_{[0,n]} = (Y_0, \ldots, Y_n)$. Notice that the normalizing constants $Z_n$ introduced in (12.54) coincide with the quantities $\hat\gamma_n(1) = \gamma_n(G_n)$ and they can be expressed in terms of the prediction flow $\eta_n$ with the product formula

$$Z_n = \hat\gamma_n(1) = \prod_{p=0}^{n} \eta_p(G_p) \tag{12.55}$$

Taking the logarithm, we obtain the so-called conditional log-likelihood functions

$$\Lambda_n = \frac{1}{n+1} \log Z_n = \frac{1}{n+1} \sum_{p=0}^{n} \log \eta_p(G_p) \tag{12.56}$$
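As a purely illustrative check of the product formula (12.55), the following sketch runs the prediction/updating recursion on a finite state space and compares the resulting normalizing constant with a brute-force sum over all paths. The finite state space, the time-homogeneous transition matrix `M`, and the function names are assumptions made for this example only; they are not part of the model described in the text.

```python
import itertools
import numpy as np

def forward_filter(eta0, M, G):
    """Exact prediction/updating recursion on a finite state space.

    eta0 : (d,) initial law, M : (d, d) transition matrix (assumed homogeneous),
    G    : (T+1, d) array with G[n, x] = g_n(x, y_n) for the fixed observations.
    Returns the predictors eta_n, the filters hat_eta_n and log Z_n.
    """
    eta = np.asarray(eta0, dtype=float)
    predictors, filters, log_z = [], [], 0.0
    for g in G:
        predictors.append(eta)
        norm = eta @ g                 # eta_n(G_n)
        log_z += np.log(norm)          # product formula, in log scale
        hat_eta = eta * g / norm       # updating (Boltzmann-Gibbs transformation)
        filters.append(hat_eta)
        eta = hat_eta @ M              # prediction
    return np.array(predictors), np.array(filters), log_z

def brute_force_z(eta0, M, G):
    """Normalizing constant obtained by summing over every path (x_0, ..., x_n)."""
    total = 0.0
    for path in itertools.product(range(len(eta0)), repeat=len(G)):
        w = eta0[path[0]] * G[0][path[0]]
        for p in range(1, len(path)):
            w *= M[path[p - 1], path[p]] * G[p][path[p]]
        total += w
    return total
```

The conditional log-likelihood of (12.56) is then `log_z / len(G)`.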

It is also interesting to examine the situation where $X$ is a path-space model; namely, suppose we have that

$$X_n = X'_{[0,n]} =_{\text{def.}} (X'_0, \ldots, X'_n) \in E_n = E'_{[0,n]} =_{\text{def.}} (E'_0 \times \cdots \times E'_n)$$

where $X'$ is a Markov chain taking values in some measurable spaces $(E'_n, \mathcal{E}'_n)$ with initial distribution $\eta'_0$ and transitions $M'_n$. In this situation, the observation sequence (12.46) takes the form

$$Y_n = H_n(X'_{[0,n]}, V_n)$$

This means that the information delivered by sensors at each time $n$ depends on the whole path of the signal $X'$ back from the origin and up to time $n$. Note that in this case the function $g_n((x'_0, \ldots, x'_n), y_n)$ depends on the current observation $Y_n = y_n$ and on the whole path-coordinates $(x'_0, \ldots, x'_n)$. This type of sensor is in fact much more general than those arising in practice. In classical filtering problems, the observation sequence is instead defined by

$$Y_n = H'_n(X'_n, V_n)$$

for some appropriate function $H'_n : E'_n \times S^v_n \to F_n$ and the resulting function $g_n(\cdot, y_n)$ only depends on the endpoint coordinate $x'_n$ of the path $(x'_0, \ldots, x'_n)$; that is,

$$g_n((x'_0, \ldots, x'_n), y_n) = g'_n(x'_n, y_n)$$

for some strictly positive function $g'_n(\cdot, y_n) : E'_n \to\, ]0, \infty[$. We emphasize that in this particular situation the pair process $(X'_n, Y_n)$ has the same form as before. It is a Markov chain taking values in the measurable spaces $(E'_n \times F_n)$. The initial distribution and the Markov transitions of $(X', Y)$ are defined as in (12.43) and (12.44) by replacing $(g_n, M_n)$ by $(g'_n, M'_n)$. From these observations, one concludes that

$$\hat\eta'_n = \mathrm{Law}(X'_n \mid Y_{[0,n]} = (y_0, \ldots, y_n))$$
$$\hat\eta_n = \mathrm{Law}(X'_{[0,n]} \mid Y_{[0,n]} = (y_0, \ldots, y_n)) \tag{12.57}$$







In connection with the engineering presentation of a filtering problem given in (12.45) and (12.46), we observe that the random sequence

$$W_{[0,n]} =_{\text{def.}} (X_0, W_1, \ldots, W_n) \in E_0 \times S^w_1 \times \cdots \times S^w_n$$

forms a Markov chain and versions of the conditional distributions

$$\hat\eta_n = \mathrm{Law}(W_{[0,n]} \mid Y_{[0,n]} = (y_0, \ldots, y_n))$$

are also given by the Feynman-Kac path measures defined by

$$\hat\eta_n(f_n) = \hat\gamma_n(f_n)/\hat\gamma_n(1) \quad\text{with}\quad \hat\gamma_n(f_n) = \mathbb{E}\Big(f_n(W_{[0,n]}) \prod_{p=0}^{n} G_p(X_p)\Big)$$

The functional representations of the conditional distributions presented above clearly belong to the same class of Feynman-Kac distribution flow models discussed in this book. In the filtering literature, the nonlinear evolution equations of these models are usually called the nonlinear filtering equations. The two major problems concern the study of the stability properties and the long time behavior of these equations, and then their numerical solution. The first question is related to the fact that the initial condition of the signal is usually unknown and any filter, even the optimal Kalman-Bucy filter in the linear/Gaussian situation, is initialized using erroneous parameters. The second question is more recurrent in the applied literature; namely, how to solve the filtering equation. Except in some very particular situations, the optimal filter equation is a nonlinear equation in an infinite-dimensional state space, and it is known that there does not exist any finite realization (see for instance [50]). We emphasize that the two questions above are intimately related. For instance, the stability properties of the filtering equations ensure that local numerical errors do not propagate. We recall that these robustness properties allow us to derive several uniform convergence estimates with respect to the time parameter (see Section 7.4.3).

The stochastic analysis and the particle methodology described in this 
book give some partial answers to both of these problems. For instance, 
the stability properties of the filtering equations can be derived using the 
Feynman-Kac contraction properties discussed in Chapter 4, Section 4.3. 
On the other hand, the numerical solution of these measure-valued processes
can be conducted using the particle methodology developed in Chapter 3. 
To avoid unnecessary repetition, it is of course out of the scope of this 
section to review all the consequences of these results. Because of their 
importance in practice and for the convenience of the reader, we provide in 
the next two sections two rather short discussions on these two problems 
with some precise references to chapters and sections on these subjects. 







12.6.4 Stability Properties of the Filtering Equations 

The long time behavior of the filtering equation can be studied using the contraction properties of the nonlinear Feynman-Kac semigroups derived in Chapter 4. This chapter presents several functional contraction inequalities for Feynman-Kac semigroups with respect to various entropy-like criteria. In filtering settings, these semigroups have natural interpretations in terms of conditional distributions. To be more precise, it is first convenient to introduce a simplified system of notation. For any $k, l \ge 0$, we set $Y_k^{k+l} = (Y_k, \ldots, Y_{k+l})$. We also slightly abuse the notation and, for any $n \le k$ and $l \ge 0$, we write

$$p(y_k^{k+l} \mid X_n = x_n) = \int \Big\{\prod_{p=k}^{k+l} g_p(x_p, y_p)\Big\} \Big\{\prod_{p=n+1}^{k+l} M_p(x_{p-1}, dx_p)\Big\} \tag{12.58}$$

the conditional density of the random vector $Y_k^{k+l}$ given $X_n = x_n$ (with respect to the $(l+1)$-tensor distribution $(q_k \otimes \cdots \otimes q_{k+l})$) and evaluated at $y_k^{k+l}$. This abusive and complex system of notation is currently used in Bayesian statistics as well as in the engineering literature. Notice that whenever the observation sequence is fixed, the quantities (12.58) only depend on the parameter $x_n \in E_n$. Recalling that $G_n(x_n) = g_n(x_n, y_n)$, the densities (12.58) are better expressed in terms of the pair of Feynman-Kac semigroups $(Q_{p,n}, \hat Q_{p,n})$ introduced in Section 2.7 and defined by



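$$Q_{p,n}(f_n)(x_p) = \mathbb{E}_{p,x_p}\Big(f_n(X_n) \prod_{k=p}^{n-1} G_k(X_k)\Big)$$

$$\hat Q_{p,n}(f_n)(x_p) = \mathbb{E}_{p,x_p}\Big(f_n(X_n) \prod_{k=p+1}^{n} G_k(X_k)\Big)$$

Indeed, an elementary manipulation yields that

$$G_{p,n}(x_p) =_{\text{def.}} Q_{p,n}(1)(x_p) = p(y_p^{n-1} \mid X_p = x_p)$$

$$\hat G_{p,n}(x_p) =_{\text{def.}} \hat Q_{p,n}(1)(x_p) = p(y_{p+1}^{n} \mid X_p = x_p)$$

From previous observations, it is also not difficult to check that the normalized Feynman-Kac semigroups $(P_{p,n}, \hat P_{p,n})$ associated with $(Q_{p,n}, \hat Q_{p,n})$ have the following interpretation:

$$P_{p,n}(x_p, dx_n) = \mathbb{P}(X_n \in dx_n \mid Y_p^{n-1} = y_p^{n-1}, X_p = x_p)$$

$$\hat P_{p,n}(x_p, dx_n) = \mathbb{P}(X_n \in dx_n \mid Y_{p+1}^{n} = y_{p+1}^{n}, X_p = x_p)$$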
To get to the final step in our discussion, we recall that the nonlinear semigroups $(\Phi_{p,n}, \hat\Phi_{p,n})$ of the conditional distribution flows $(\eta_n, \hat\eta_n)$ are expressed in terms of the Markov transitions $(P_{p,n}, \hat P_{p,n})$ and the potential functions $(G_{p,n}, \hat G_{p,n})$ with the formulae

$$\Phi_{p,n}(\mu) = \Psi_{p,n}(\mu) P_{p,n} \qquad \hat\Phi_{p,n}(\mu) = \hat\Psi_{p,n}(\mu) \hat P_{p,n} \tag{12.59}$$

where $(\Psi_{p,n}, \hat\Psi_{p,n})$ are the Boltzmann-Gibbs transformations associated with the pair $(G_{p,n}, \hat G_{p,n})$ and defined by

$$\Psi_{p,n}(\mu)(f_n) = \mu(f_n G_{p,n})/\mu(G_{p,n}) \quad\text{and}\quad \hat\Psi_{p,n}(\mu)(f_n) = \mu(f_n \hat G_{p,n})/\mu(\hat G_{p,n})$$

We finally recall that these conditional distributions can be regarded as the transitions of a nonhomogeneous Markov chain. This observation combined with the Boltzmann-Gibbs representations (12.59) of the semigroups $(\Phi_{p,n}, \hat\Phi_{p,n})$ is one of the key points of our approach to the stability of Feynman-Kac semigroups developed in Section 4.3. More precisely, for any $p \le q \le n$, we have the decompositions

$$P_{p,n} = P^{(n)}_{p,q} P_{q,n} \quad\text{and}\quad \hat P_{p,n} = \hat P^{(n)}_{p,q} \hat P_{q,n}$$

with the Markov transitions

$$P^{(n)}_{p,q}(x_p, dx_q) = \mathbb{P}(X_q \in dx_q \mid Y_p^{n-1} = y_p^{n-1}, X_p = x_p)$$

$$\hat P^{(n)}_{p,q}(x_p, dx_q) = \mathbb{P}(X_q \in dx_q \mid Y_{p+1}^{n} = y_{p+1}^{n}, X_p = x_p)$$

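To give a flavor of the stability properties that can be deduced from Chapter 4, let us assume that the signal transitions $M_n$ satisfy the mixing condition $(M)_m$ for $m = 1$ and some sequence of numbers $\epsilon_n(M)$ with $\epsilon = \inf_n \epsilon_n(M) > 0$ (see page 116). Note that this condition ensures that the Markov transitions $T_n$ of the pair Markov chain $(X_n, Y_n)$ given in (12.44) have the same mixing property; that is,

$$T_n((x_{n-1}, y_{n-1}), d(x_n, y_n)) \ge \epsilon\, T_n((x'_{n-1}, y'_{n-1}), d(x_n, y_n))$$

Rephrasing Proposition 4.4.2, we find that

$$\beta(\hat P^{(n)}_{p,p+1}) \le 1 - \epsilon^2 \quad\text{and}\quad \hat G_{p,n}(x_p) \ge \epsilon\, \hat G_{p,n}(x'_p)$$

for any pair $(x_p, x'_p) \in E_p^2$. We conclude that the semigroup $\hat\Phi_{p,n}$ of the optimal filter is exponentially asymptotically stable in the sense that

$$\beta(\hat P_{p,p+n}) = \sup_{\mu_1, \mu_2} \|\hat\Phi_{p,p+n}(\mu_1) - \hat\Phi_{p,p+n}(\mu_2)\|_{\mathrm{tv}}$$

$$= \sup_{x_p, x'_p} \big\|\mathbb{P}(X_{p+n} \in \cdot \mid Y_{p+1}^{p+n} = y_{p+1}^{p+n}, X_p = x_p) - \mathbb{P}(X_{p+n} \in \cdot \mid Y_{p+1}^{p+n} = y_{p+1}^{p+n}, X_p = x'_p)\big\|_{\mathrm{tv}} \le (1 - \epsilon^2)^n$$

If we take $(\mu_1, \mu_2) = (\eta_0 M_1 \cdots M_p, \hat\eta_p)$, then we find that

$$\hat\Phi_{p,p+n}(\mu_1) = \mathbb{P}(X_{p+n} \in \cdot \mid Y_{p+1}^{p+n} = y_{p+1}^{p+n})$$

$$\hat\Phi_{p,p+n}(\mu_2) = \mathbb{P}(X_{p+n} \in \cdot \mid Y_{0}^{p+n} = y_{0}^{p+n})$$

From previous inequalities, we readily deduce the uniform estimate for the approximation and finite $m$-memory filters

$$\big\|\mathbb{P}(X_{p+m} \in \cdot \mid Y_{p+1}^{p+m} = y_{p+1}^{p+m}) - \mathbb{P}(X_{p+m} \in \cdot \mid Y_{0}^{p+m} = y_{0}^{p+m})\big\|_{\mathrm{tv}} \le (1 - \epsilon^2)^m$$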
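The exponential forgetting of the initial condition can be observed numerically. The sketch below is only an illustration under simplifying assumptions that are not in the text: a finite state space, a time-homogeneous transition matrix `M` with strictly positive entries, and total variation taken as half the $L_1$ distance. It runs the optimal filter from two different initial laws with the same potentials and returns the total variation gap at each time, which can be compared with the bound $(1 - \epsilon^2)^n$.

```python
import numpy as np

def filter_path(eta0, M, G):
    """Optimal filters hat_eta_0, ..., hat_eta_n for potentials G[n, x] = g_n(x, y_n)."""
    eta, out = np.asarray(eta0, dtype=float), []
    for g in G:
        hat = eta * g
        hat = hat / hat.sum()      # updating
        out.append(hat)
        eta = hat @ M              # prediction
    return np.array(out)

def mixing_constant(M):
    """Largest eps such that M[x, z] >= eps * M[x', z] for all x, x', z."""
    return float(np.min(M[:, None, :] / M[None, :, :]))

def tv_gap(eta0_a, eta0_b, M, G):
    """Total variation distance between the two filters at each time step."""
    fa, fb = filter_path(eta0_a, M, G), filter_path(eta0_b, M, G)
    return 0.5 * np.abs(fa - fb).sum(axis=1)
```

For instance, starting the two filters at Dirac masses on different states gives a gap of 1 at time 0 that decays at least geometrically with rate $1 - \epsilon^2$, where `eps = mixing_constant(M)`.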

12.6.5 Asymptotic Properties of Log-likelihood Functions 

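The uniform and strong exponential stability estimates provided in Section 12.6.4 also appear to be useful in the asymptotic analysis of the conditional log-likelihood functions introduced in (12.52). Suppose the pair of signals/observations $(X_n, Y_n)$ forms a time-homogeneous Markov chain with initial distribution $\nu_\theta$ and elementary transitions $T_{\theta,n} = T_\theta$ given in (12.50) and (12.51). From the product formula (12.55), we first find that

$$\Lambda_n(\theta, \theta^*) = \frac{1}{n+1} \sum_{p=0}^{n} \log \eta_{\theta, Y_0^{p-1}, p}(G_{\theta, Y_p})$$

$$= \frac{1}{n+1} \log \eta_{\theta,0}(G_{\theta, Y_0}) + \frac{1}{n+1} \sum_{p=0}^{n-1} \log \hat\eta_{\theta, Y_0^{p}, p} M_\theta(G_{\theta, Y_{p+1}})$$

where $Y$ represents a series of observations of the parameter $\theta^*$ and with some obvious and usual abusive notation

$$\eta_{\theta, y_0^{n-1}, n}(dx_n) = \mathbb{P}_\theta(X_n \in dx_n \mid Y_0^{n-1} = y_0^{n-1})$$

$$\hat\eta_{\theta, y_0^{n}, n}(dx_n) = \mathbb{P}_\theta(X_n \in dx_n \mid Y_0^{n} = y_0^{n})$$

$$G_{\theta, y_n}(x_n) = g_\theta(x_n, y_n)$$

for any realization sequence $y_0^n \in (F_0 \times \cdots \times F_n)$. With some obvious abusive notation, we have

$$p_\theta(y_{n+1} \mid y_0^n) = \hat\eta_{\theta, y_0^n, n} M_\theta(G_{\theta, y_{n+1}}) = \int \mathbb{P}_\theta(X_n \in dx_n \mid Y_0^n = y_0^n)\, M_\theta(x_n, dx_{n+1})\, g_\theta(x_{n+1}, y_{n+1})$$

Suppose that the Markov transitions $M_\theta$ satisfy the mixing condition $(M)_m$ for $m = 1$ and for some sequence of numbers $\epsilon_\theta(M)$ such that

$$\epsilon = \inf_\theta \epsilon_\theta(M) > 0$$

Under this rather strong uniform mixing condition, all the estimates derived in Section 12.6.4 remain valid. For instance, we have

$$\big\|\mathbb{P}_\theta(X_{p+m} \in \cdot \mid Y_{p+1}^{p+m} = y_{p+1}^{p+m}) - \mathbb{P}_\theta(X_{p+m} \in \cdot \mid Y_{0}^{p+m} = y_{0}^{p+m})\big\|_{\mathrm{tv}} \le (1 - \epsilon^2)^m \tag{12.60}$$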

Let $\hat\eta'_{\theta, y_0^n, n}$ be the solution of the filtering equation starting at some erroneous initial condition $\eta'_0$, and let $\Lambda'_n(\theta, \theta^*)$ be the corresponding log-likelihood function. Arguing as in Section 12.4 and assuming for simplicity that $\sup_\theta \|g_\theta\| < \infty$, we conclude that

$$\big|\log \hat\eta_{\theta, y_0^n, n} M_\theta(G_{\theta, y_{n+1}}) - \log \hat\eta'_{\theta, y_0^n, n} M_\theta(G_{\theta, y_{n+1}})\big| \le 2\epsilon^{-1} (1 - \epsilon^2)^n$$

and therefore $n \|\Lambda'_n - \Lambda_n\| \le c(\epsilon)$, $\mathbb{P}_{\theta^*}$-a.s., for some constant whose values only depend on $\epsilon$. This shows that the conditional log-likelihood function does not depend asymptotically on the initial condition of the filter. In much the same way, if we denote by $\Lambda_n^{(m)}$ the log-likelihood function associated with the finite $m$-memory filter, then by (12.60) we readily prove that

$$\|\Lambda_n^{(m)} - \Lambda_n\| \le 2\epsilon^{-1} (1 - \epsilon^2)^m (1 - m/n)^+$$

with $a^+ = \max(a, 0)$. We fix the memory length $m$ and we denote by $(U_n)_{n \ge m-1}$ the $(E \times F)^{m+1}$-valued Markov process

$$U_n = ((X_{n-m+1}, Y_{n-m+1}), \ldots, (X_{n+1}, Y_{n+1}))$$

Under our assumptions, we have for any $u_n, u'_n \in (E \times F)^{m+1}$

$$\mathcal{T}_\theta^{m+1}(u_n, \cdot) \ge \epsilon^{m+1}\, \mathcal{T}_\theta^{m+1}(u'_n, \cdot)$$

where $\mathcal{T}_\theta$ represents the Markov transition of $U_n$. This shows that $U_n$ is exponentially asymptotically stable and it has a unique invariant measure $\nu_\theta$. We readily deduce the almost sure convergence

$$\lim_{n \to \infty} \Lambda_n^{(m)}(\theta, \theta^*) = \Lambda^{(m)}(\theta, \theta^*) =_{\text{def.}} \mathbb{E}_{\nu_{\theta^*}}\big(\log \hat\eta_{\theta, Y_0^{m-1}, m-1} M_\theta(G_{\theta, Y_m})\big)$$

This result can be proven using for instance the Poisson equation and classical martingale convergence theorems. Recalling that $p_\theta(y_{n+1} \mid y_0^n) = \hat\eta_{\theta, y_0^n, n} M_\theta(G_{\theta, y_{n+1}})$, the limit criterion is often rewritten as follows

$$\Lambda^{(m)}(\theta, \theta^*) = \mathbb{E}_{\nu_{\theta^*}}\big(\log p_\theta(Y_m \mid Y_0^{m-1})\big)$$

Using related arguments, we can prove that $\Lambda_n(\theta, \theta^*)$ converges almost surely, with respect to the law of the stationary process $(X_n, Y_n)$ and as $n \to \infty$, to some deterministic function $\Lambda(\theta, \theta^*)$. These results combined with some appropriate regularity conditions on the function $\theta \mapsto (g_\theta, M_\theta)$ imply that $\Lambda(\theta, \theta^*) \le \Lambda(\theta^*, \theta^*)$ with equality if and only if $\theta = \theta^*$ (see for instance [121] and references therein).







12.6.6 Particle Approximation Measures 

We first recall that the flows $\eta_n$ and $\hat\eta_n$ are solutions of nonlinear equations with various McKean interpretations (see Section 11.3 and Section 2.5.3). Each McKean interpretation is attached to a different evolutionary particle approximation model (see Chapter 3). To give a numerical flavor to this section, we roughly describe the evolution of the simple genetic approximation model. In the latter and between two observations, the particles evolve as independent copies of the signal. When an observation arrives, we randomly select better-fitted individuals with respect to their likelihoods. This simple algorithm can be refined in various ways using the particle recipes presented in Chapter 11. For instance, we can change the sampling distribution and use conditional exclusion type mutations to refine the precision of the stochastic grid (see for instance Section 11.7). Notice that in this situation the pair potentials/transitions $(\hat G_n, \hat M_n)$ introduced in (11.16) have the form

$$\hat\eta_0 = \mathbb{P}(X_0 \in dx_0 \mid Y_0 = y_0)$$

$$\hat G_n(x_{n-1}) = p(y_n \mid x_{n-1})$$

$$\hat M_n(x_{n-1}, dx_n) = \mathbb{P}(X_n \in dx_n \mid X_{n-1} = x_{n-1}, Y_n = y_n)$$

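More generally, the triplets $(\eta_0^{(p)}, G_n^{(p)}, M_n^{(p)})$ corresponding to the extended excursion model and defined in (11.20) take the form

$$\eta_0^{(p)} = \mathbb{P}(X_{[0,p_1)} \in dx_{[0,p_1)} \mid Y_{[0,p_1)} = y_{[0,p_1)})$$

$$G_n^{(p)}(x_{[p_{n-1},p_n)}) = p(y_{[p_n,p_{n+1})} \mid x_{p_n - 1})$$

and

$$M_n^{(p)}(x_{[p_{n-1},p_n)}, dx_{[p_n,p_{n+1})}) = \mathbb{P}(X_{[p_n,p_{n+1})} \in dx_{[p_n,p_{n+1})} \mid X_{p_n - 1} = x_{p_n - 1}, Y_{[p_n,p_{n+1})} = y_{[p_n,p_{n+1})})$$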

The accuracy and the computational cost of the selection stage can also be improved using one of the branching rules proposed in Section 11.8.

Each branching particle model has a natural birth and death interpretation that induces the important notions of the ancestral line of each current individual and the corresponding genealogical trees. A review of these path-space models is provided in Section 11.4. The occupation measures of these path-particle historical processes provide a natural approximation of the laws (12.57) of the path of a signal given a series of observations. More precise models can be found in Section 3.4 and their asymptotic analysis is described in full detail in Chapters 7, 8, 9, and 10. We also mention that, mimicking the product formula (12.55), we construct an on-line and unbiased particle estimate of the normalizing constants introduced in (12.54). We again refer to Section 11.3 for some details on these particle approximation models.
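A minimal sketch of the simple genetic approximation model described above is given next, for a scalar signal. It is written under assumptions of our own: the callables `sample_init`, `mutate`, and `potential` are hypothetical names supplied by the user, selection is plain multinomial resampling at every step, and the log of the particle estimate of the normalizing constant is accumulated by mimicking the product formula (12.55). None of the refinements mentioned in the text (branching rules, conditional mutations) are included.

```python
import numpy as np

def genetic_particle_filter(sample_init, mutate, potential, ys, n_particles, rng):
    """Mutation/selection particle approximation of the optimal filter (scalar signal).

    sample_init(N, rng) -> (N,) samples from eta_0            (hypothetical callable)
    mutate(x, rng)      -> (N,) one signal transition M_n      (hypothetical callable)
    potential(x, y)     -> (N,) values G_n(x) = g_n(x, y_n)    (hypothetical callable)
    Returns the particle estimates of the filter means and of log Z_n.
    """
    x = sample_init(n_particles, rng)
    log_z, means = 0.0, []
    for n, y in enumerate(ys):
        if n > 0:
            x = mutate(x, rng)                    # mutation: independent signal moves
        w = potential(x, y)                       # likelihood of each particle
        log_z += np.log(w.mean())                 # eta_n^N(G_n), as in the product formula
        w = w / w.sum()
        means.append(np.sum(w * x))               # weighted mean approximates the filter mean
        x = x[rng.choice(n_particles, size=n_particles, p=w)]   # selection
    return np.array(means), log_z
```

On a linear/Gaussian model the output can be compared with the Kalman filter, which gives the exact filter means and log-likelihood.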







12.6.7 A Partially Linear/Gaussian Filtering Model 

Quenched and Annealed Feynman-Kac models 

In this section, we examine the nonlinear filtering model with a linear/Gaussian component described in (12.48) on page 502. To simplify the presentation, we restrict ourselves to homogeneous measurable mappings $(A_n, B_n, C_n) = (A, B, C)$ and null drift functions $(a_n, c_n) = 0$. The extension to the general case is straightforward (see Section 2.5.4). We use the modeling techniques presented in Section 2.6 to introduce a quenched Kalman-Bucy equation. In this context, the quenched flow is Gaussian and can be solved explicitly for any realization of the randomness. The annealed distributions are difficult to solve in practice. In the filtering context, they represent the conditional distributions of the "nonlinear part" of the signal given the observations. We connect the annealed and quenched models in terms of the Feynman-Kac model in distribution space introduced in Section 2.6. Let us denote by $\mathcal{N}(m, P)$ a Gaussian distribution on $\mathbb{R}^p$ with mean vector $m \in \mathbb{R}^p$ and covariance matrix $P \in \mathbb{R}^{p \times p}$

$$\mathcal{N}(m, P)(dx) = \frac{1}{(2\pi)^{p/2} \sqrt{|P|}} \exp\Big[-\frac{1}{2} (x - m) P^{-1} (x - m)'\Big]\, dx$$



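By direct inspection, we see that the pair signal $X_n = (X^1_n, X^2_n) \in E_n = E^{(1)} \times \mathbb{R}^p$ forms a Markov chain. Its Markov transitions take the form

$$M_n((x_{n-1}, z_{n-1}), d(x_n, z_n)) = M^{(1)}_n(x_{n-1}, dx_n)\, M^{(2)}_{x_n, n}(z_{n-1}, dz_n)$$

with the Gaussian transition on $\mathbb{R}^p$

$$M^{(2)}_{x_n, n}(z_{n-1}, dz_n) = \mathbb{P}((A(x_n) z_{n-1} + B(x_n) W_n) \in dz_n)$$

Similarly, the initial distribution of the pair $(X^1_0, X^2_0)$ is given by

$$\eta_0(d(x_0, z_0)) = \eta^{(1)}_0(dx_0)\, \eta^{(2)}_{x_0, 0}(dz_0)$$

with $\eta^{(2)}_{x_0, 0} = \mathcal{N}(\mathrm{Mean}_0(x_0), \mathrm{Cov}_0(x_0))$. This pair signal model is clearly of the same form as the one discussed in Section 2.6. More precisely, the distribution of the path $(X^1_0, \ldots, X^1_n)$ is defined by

$$\mathbb{P}^{(1)}_{\eta^{(1)}_0, n}(d(x_0, \ldots, x_n)) = \eta^{(1)}_0(dx_0)\, M^{(1)}_1(x_0, dx_1) \cdots M^{(1)}_n(x_{n-1}, dx_n)$$

and, given a realization $X^1 = x = (x_n)_{n \ge 0} \in (E^{(1)})^{\mathbb{N}}$, the second component $X^2 = (X^2_n)_{n \ge 0}$ forms an $\mathbb{R}^p$-valued Markov chain and the conditional distribution of the path $X^2_{[0,n]} =_{\text{def.}} (X^2_0, \ldots, X^2_n)$ is defined by the formula

$$\mathbb{P}^{(2)}_{[x], n}(d(z_0, \ldots, z_n)) = \eta^{(2)}_{x_0, 0}(dz_0)\, M^{(2)}_{x_1, 1}(z_0, dz_1) \cdots M^{(2)}_{x_n, n}(z_{n-1}, dz_n)$$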





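From this expression, we notice that $\mathbb{P}^{(2)}_{[x], n}$ only depends on $(x_0, \ldots, x_n)$. Let $G_n : E^{(1)} \times \mathbb{R}^p \to (0, \infty)$ be the likelihood functions defined by

$$G_n(x_n, z_n) = g_n((x_n, z_n), y_n) = \frac{d\mathcal{N}(C(x_n) z_n, R^v_n)}{d\mathcal{N}(0, R^v_n)}(y_n)$$

From the considerations above we find that

$$\mathbb{P}_{\eta_0, n}\big(X_{[0,n]} \in d((x_0, z_0), \ldots, (x_n, z_n)) \mid Y_{[0,n]} = (y_0, \ldots, y_n)\big)$$

$$= \frac{1}{Z_n} \Big\{\prod_{p=0}^{n} G_p(x_p, z_p)\Big\}\, \mathbb{P}^{(1)}_{\eta^{(1)}_0, n}(d(x_0, \ldots, x_n))\, \mathbb{P}^{(2)}_{[x], n}(d(z_0, \ldots, z_n))$$

with $X_{[0,n]} =_{\text{def.}} (X_0, \ldots, X_n)$, $Y_{[0,n]} =_{\text{def.}} (Y_0, \ldots, Y_n)$, and the normalizing constant $Z_n > 0$. If we write $X^i_{[0,n]} =_{\text{def.}} (X^i_0, \ldots, X^i_n)$ for $i = 1, 2$, then we find that

$$\mathbb{P}_{\eta_0, n}\big(X^2_{[0,n]} \in d(z_0, \ldots, z_n) \mid X^1_{[0,n]} = (x_0, \ldots, x_n), Y_{[0,n]} = (y_0, \ldots, y_n)\big)$$

$$= \frac{1}{Z_{[x], n}} \Big\{\prod_{p=0}^{n} G_p(x_p, z_p)\Big\}\, \mathbb{P}^{(2)}_{[x], n}(d(z_0, \ldots, z_n))$$

with the normalizing constant $Z_{[x], n} > 0$. The marginal distributions are defined for any $f \in \mathcal{B}_b(\mathbb{R}^p)$ by the Feynman-Kac formulae

$$\eta^{(2)}_{[x], n}(f) = \gamma^{(2)}_{[x], n}(f)/\gamma^{(2)}_{[x], n}(1)$$

with

$$\gamma^{(2)}_{[x], n}(f) = \mathbb{E}^{(2)}_{[x]}\Big(f(X^2_n) \prod_{p=0}^{n-1} G_{x_p, p}(X^2_p)\Big) \tag{12.61}$$

with the "random" potential functions

$$G_{x_n, n} : z \in \mathbb{R}^p \mapsto G_{x_n, n}(z) = G_n(x_n, z) \in (0, \infty)$$

It is also convenient to consider their updated versions

$$\hat\eta^{(2)}_{[x], n}(f) = \hat\gamma^{(2)}_{[x], n}(f)/\hat\gamma^{(2)}_{[x], n}(1) \quad\text{with}\quad \hat\gamma^{(2)}_{[x], n}(f) = \gamma^{(2)}_{[x], n}(f\, G_{x_n, n}) \tag{12.62}$$

The annealed marginal distributions on $E^{(1)}$ are defined for any $f \in \mathcal{B}_b(E^{(1)})$ by the Feynman-Kac formulae

$$\eta^{(1)}_n(f) = \gamma^{(1)}_n(f)/\gamma^{(1)}_n(1) \quad\text{with}\quad \gamma^{(1)}_n(f) = \mathbb{E}_{\eta_0}\Big(f(X^1_n) \prod_{p=0}^{n-1} G_p(X_p)\Big)$$

$$\hat\eta^{(1)}_n(f) = \hat\gamma^{(1)}_n(f)/\hat\gamma^{(1)}_n(1) \quad\text{with}\quad \hat\gamma^{(1)}_n(f) = \mathbb{E}_{\eta_0}\Big(f(X^1_n) \prod_{p=0}^{n} G_p(X_p)\Big)$$

In our context, these Feynman-Kac flows represent the one-step predictors and the optimal filters

$$\eta^{(2)}_{[x], n} = \mathrm{Law}(X^2_n \mid Y_{[0,n-1]} = (y_0, \ldots, y_{n-1}),\ X^1_{[0,n]} = (x_0, \ldots, x_n))$$

$$\hat\eta^{(2)}_{[x], n} = \mathrm{Law}(X^2_n \mid Y_{[0,n]} = (y_0, \ldots, y_n),\ X^1_{[0,n]} = (x_0, \ldots, x_n))$$

and

$$\eta^{(1)}_n = \mathrm{Law}(X^1_n \mid Y_{[0,n-1]} = (y_0, \ldots, y_{n-1}))$$

$$\hat\eta^{(1)}_n = \mathrm{Law}(X^1_n \mid Y_{[0,n]} = (y_0, \ldots, y_n))$$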



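Quenched Kalman-Bucy Filters

The quenched marginal distributions can be solved using the traditional Kalman-Bucy filter (see Section 2.5.4). More precisely, for any realization of the chain $X^1 = x$ and for any sequence of observations $Y = y$, the one-step predictor and the optimal filter are Gaussian distributions

$$\eta^{(2)}_{[x], n} = \mathcal{N}(\hat X^{2,-}_{x,n}, P^{-}_{x,n}) \quad\text{and}\quad \hat\eta^{(2)}_{[x], n} = \mathcal{N}(\hat X^{2}_{x,n}, P_{x,n})$$

As traditionally, we slightly abuse the notation and suppress the dependence on the observation sequence. In this notation, we write $\hat X^{2,-}_{x,n}$, $P^{-}_{x,n}$, $\hat X^{2}_{x,n}$, and $P_{x,n}$ instead of $\hat X^{2,-}_{(x,y),n}$, $P^{-}_{(x,y),n}$, $\hat X^{2}_{(x,y),n}$, and $P_{(x,y),n}$. The synthesis of the conditional mean and covariance matrices is carried out using the traditional Kalman-Bucy recursive equations (see Section 2.5.4). For $n = 0$, we recall that the initial conditions of the latter are given by

$$\hat X^{2,-}_{x,0} = \mathrm{Mean}_0(x_0) \quad\text{and}\quad P^{-}_{x,0} = \mathrm{Cov}_0(x_0)$$

The filter equation is decomposed into the traditional two-step updating/prediction transitions

$$(\hat X^{2,-}_{x,n}, P^{-}_{x,n}) \xrightarrow{\ \text{Updating}\ } (\hat X^{2}_{x,n}, P_{x,n}) \xrightarrow{\ \text{Prediction}\ } (\hat X^{2,-}_{x,n+1}, P^{-}_{x,n+1})$$

These two mechanisms are defined as follows.

• Updating: This transition depends on the current observation $Y_n = y_n$ and it is defined by the relations

$$\hat X^{2}_{x,n} = \hat X^{2,-}_{x,n} + G_{x,n}\,(y_n - C(x_n) \hat X^{2,-}_{x,n})$$
$$P_{x,n} = (I - G_{x,n} C(x_n))\, P^{-}_{x,n}$$

with the gain matrix

$$G_{x,n} = P^{-}_{x,n} C(x_n)'\, \big(C(x_n) P^{-}_{x,n} C(x_n)' + R^v_n\big)^{-1}$$

• Prediction: This transition does not depend on the observation and it is given by the simple relations

$$\hat X^{2,-}_{x,n+1} = A(x_{n+1})\, \hat X^{2}_{x,n}$$
$$P^{-}_{x,n+1} = A(x_{n+1})\, P_{x,n}\, A(x_{n+1})' + B(x_{n+1})\, R^w_{n+1}\, B(x_{n+1})'$$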
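The updating and prediction transitions can be sketched in a few lines of NumPy. This is only an illustration: vectors are treated as columns, the matrices `A`, `B`, `C` are assumed to have already been evaluated at the relevant state of the first component, and `Rv`, `Rw` stand for the covariance matrices of the observation and signal noises. The function names are ours.

```python
import numpy as np

def kalman_update(m_pred, P_pred, y, C, Rv):
    """Updating step: from the one-step predictor N(m_pred, P_pred) to the filter."""
    S = C @ P_pred @ C.T + Rv                      # innovation covariance
    gain = P_pred @ C.T @ np.linalg.inv(S)         # gain matrix
    m = m_pred + gain @ (y - C @ m_pred)
    P = (np.eye(len(m_pred)) - gain @ C) @ P_pred
    return m, P

def kalman_predict(m, P, A, B, Rw):
    """Prediction step: from the filter N(m, P) to the next one-step predictor."""
    return A @ m, A @ P @ A.T + B @ Rw @ B.T
```

In the quenched setting one would call `kalman_update` with `C = C(x_n)` and then `kalman_predict` with `A = A(x_{n+1})` and `B = B(x_{n+1})` along the given realization of the first component.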

A Feynman-Kac Model in Distribution Space

In our context the Feynman-Kac model in distribution space presented in Section 2.6.2 is defined in terms of the Markov chain

$$X'_n = (X^1_n, \eta^{(2)}_{[X^1], n}) \in E' = (E^{(1)} \times \mathcal{P}(\mathbb{R}^p))$$

From previous considerations, the second component is a random Gaussian distribution

$$\eta^{(2)}_{[X^1], n} = \mathcal{N}(\hat X^{2,-}_{X^1, n}, P^{-}_{X^1, n})$$

It corresponds to the one-step predictor associated with a realization of the chain $X^1$. Its evolution in time is given by the Kalman-Bucy equation, which can be written in terms of a measure-valued process

$$\eta^{(2)}_{[X^1], n+1} = \Phi_{n+1}\big((X^1_n, X^1_{n+1}), \eta^{(2)}_{[X^1], n}\big)$$

with initial condition

$$\eta^{(2)}_{[X^1], 0} = \mathcal{N}(\hat X^{2,-}_{X^1, 0}, P^{-}_{X^1, 0}) = \mathcal{N}(\mathrm{Mean}_0(X^1_0), \mathrm{Cov}_0(X^1_0))$$

The nonlinear nature of the filtering problem leads to a collection of mappings

$$\Phi_{n+1} : (E^{(1)} \times E^{(1)}) \times \mathcal{P}(\mathbb{R}^p) \to \mathcal{P}(\mathbb{R}^p)$$

that preserve the subset $\mathrm{Gauss}(\mathbb{R}^p) \subset \mathcal{P}(\mathbb{R}^p)$ of Gaussian distributions on $\mathbb{R}^p$. From Kalman-Bucy recursions, we find that

$$\Phi_{n+1}((u, v), \mathcal{N}(m, P)) = \mathcal{N}\big(\mathrm{Mean}_{n+1}((u, v), (m, P)),\ \mathrm{Cov}_{n+1}((u, v), (m, P))\big) \tag{12.63}$$

The quantities $\mathrm{Mean}_{n+1}((u, v), (m, P))$ and $\mathrm{Cov}_{n+1}((u, v), (m, P))$ are computed using the following updating/prediction rules.

$$\mathrm{Mean}_{n+1}((u, v), (m, P)) = A(v)\, m(u)$$
$$\mathrm{Cov}_{n+1}((u, v), (m, P)) = A(v)\, P(u)\, A(v)' + B(v)\, R^w_{n+1}\, B(v)'$$

with the updated pair $(m(u), P(u))$ defined by

$$m(u) = m + G_n(u)\,(y_n - C(u)\, m)$$
$$P(u) = (I - G_n(u) C(u))\, P$$

with the gain matrix $G_n(u) = P\, C(u)'\, [C(u)\, P\, C(u)' + R^v_n]^{-1}$. For more details we refer the reader to Section 2.5.4.

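The Markov chain $X'_n$ has transitions defined for any $f \in \mathcal{B}_b(E')$ and $(u, \eta) \in E'$ by

$$M'_{n+1}(f)(u, \eta) = \int_{E^{(1)}} M^{(1)}_{n+1}(u, dv)\, f\big(v, \Phi_{n+1}((u, v), \eta)\big)$$

We also see that the elementary transition of the distribution component $\eta^{(2)}_{[X^1], n} \to \eta^{(2)}_{[X^1], n+1}$ is deterministic given the first one $X^1_n \to X^1_{n+1}$. This can be summarized by the synthetic formula

$$X'_n = (X^1_n, \eta^{(2)}_{[X^1], n}) \longrightarrow X'_{n+1} = \big(X^1_{n+1},\ \Phi_{n+1}((X^1_n, X^1_{n+1}), \eta^{(2)}_{[X^1], n})\big)$$

We consider the annealed potential functions

$$G'_n(x, \mu) = \int \mu(dz)\, G_n(x, z) \in (0, \infty) \tag{12.64}$$

Since we have

$$G'_n(u, \mathcal{N}(m, P)) = \mathcal{N}(m, P)(G_{u,n}) = \frac{d\mathcal{N}(C(u)\, m,\ C(u) P C(u)' + R^v_n)}{d\mathcal{N}(0, R^v_n)}(y_n) \tag{12.65}$$

we conclude that

$$G'_n(X^1_n, \eta^{(2)}_{[X^1], n}) = \frac{d\mathcal{N}(C(X^1_n)\, \hat X^{2,-}_{X^1, n},\ C(X^1_n) P^{-}_{X^1, n} C(X^1_n)' + R^v_n)}{d\mathcal{N}(0, R^v_n)}(y_n)$$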

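We finally associate with the pair $(G'_n, M'_n)$ the distribution flows $(\eta'_n, \hat\eta'_n)$ on $E'$ defined for any $f' \in \mathcal{B}_b(E')$ by

$$\eta'_n(f') = \gamma'_n(f')/\gamma'_n(1) \quad\text{and}\quad \hat\eta'_n(f') = \hat\gamma'_n(f')/\hat\gamma'_n(1) \tag{12.66}$$

with

$$\gamma'_n(f') = \mathbb{E}'_{\eta'_0}\Big(f'(X'_n) \prod_{p=0}^{n-1} G'_p(X'_p)\Big) \quad\text{and}\quad \hat\gamma'_n(f') = \gamma'_n(f' G'_n)$$

By Proposition 2.6.4, if we choose $f'(x, \eta) = f(x)$ for some $f \in \mathcal{B}_b(E^{(1)})$, then we find that

$$\eta'_n(f') = \eta^{(1)}_n(f) = \mathbb{E}_{\eta_0}(f(X^1_n) \mid Y_{[0,n-1]} = (y_0, \ldots, y_{n-1}))$$

$$\hat\eta'_n(f') = \hat\eta^{(1)}_n(f) = \mathbb{E}_{\eta_0}(f(X^1_n) \mid Y_{[0,n]} = (y_0, \ldots, y_n))$$

In the same way, for any $f \in \mathcal{B}_b(\mathbb{R}^p)$ we find that

$$\eta'_n(f') = \mathbb{E}_{\eta_0}(f(X^2_n) \mid Y_{[0,n-1]} = (y_0, \ldots, y_{n-1}))$$

as soon as $f'(x, \eta) = \eta(f)$, and

$$\hat\eta'_n(f') = \mathbb{E}_{\eta_0}(f(X^2_n) \mid Y_{[0,n]} = (y_0, \ldots, y_n))$$

as soon as $f'(x, \eta) = \eta(G_{x,n} f)/\eta(G_{x,n})$.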

Much more is true. If we consider the signal/observation filtering model

$$X'_n = (X^1_n, \eta^{(2)}_{[X^1], n}), \qquad Y_n = C(X^1_n)\, \hat X^{2,-}_{X^1, n} + \hat V_{X^1, n} \tag{12.67}$$

with the quenched innovation sequence

$$\hat V_{X^1, n} = Y_n - \mathbb{E}(Y_n \mid Y_{[0,n-1]}, X^1_{[0,n]}) = Y_n - C(X^1_n)\, \hat X^{2,-}_{X^1, n}$$

then we find that

$$\eta'_n = \mathrm{Law}(X'_n \mid Y_{[0,n-1]} = (y_0, \ldots, y_{n-1}))$$
$$\hat\eta'_n = \mathrm{Law}(X'_n \mid Y_{[0,n]} = (y_0, \ldots, y_n))$$

Speaking somewhat loosely, in this interpretation we see that the potential function

$$G'_n(x_n, \eta^{(2)}_{[x], n}) \propto \exp\Big(-\frac{1}{2} (y_n - C(x_n) \hat X^{2,-}_{x,n})\, R^v_n(x_n)^{-1}\, (y_n - C(x_n) \hat X^{2,-}_{x,n})'\Big)$$

with $R^v_n(x_n) = (C(x_n) P^{-}_{x,n} C(x_n)' + R^v_n)$, represents the probability that the observation $Y_n = y_n$ would be made given $Y_{[0,n-1]}$ and the value $X^1_n = x_n$. The observation model in (12.67) is sometimes called a "virtual sensor" in the literature on multimodel estimation. For static models $X^1_n = X^1_0$ taking values in a finite set, the filtering equations associated with (12.67) coincide with the so-called multiple hypothesis testing algorithm (MHT).



Interacting Kalman-Bucy Filters 

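In this section, we briefly discuss the simple genetic model associated with the Feynman-Kac distribution flow $\eta'_n$. By construction, we first notice that the algorithm consists here of $N$ (state, measure)-valued particles

$$\xi^i_n = (\zeta^i_n, \mu^i_n) \quad\text{and}\quad \hat\xi^i_n = (\hat\zeta^i_n, \hat\mu^i_n) \in E' = E^{(1)} \times \mathrm{Gauss}(\mathbb{R}^p)$$

The initial configuration $\xi^i_0 = (\zeta^i_0, \mu^i_0)$ is defined by $N$ independent random variables $\zeta^i_0$ with common distribution $\eta^{(1)}_0$. The $N$ measure components are simply given by $\mu^i_0 = \mathcal{N}(m^i_0, P^i_0)$ with initial mean and covariance matrix

$$m^i_0 = \mathrm{Mean}_0(\zeta^i_0) \quad\text{and}\quad P^i_0 = \mathrm{Cov}_0(\zeta^i_0)$$

The selection transition consists in randomly choosing $N$ particles $\hat\xi^i_n = (\hat\zeta^i_n, \hat\mu^i_n)$ with common law

$$\sum_{i=1}^{N} \frac{G'_n(\zeta^i_n, \mu^i_n)}{\sum_{j=1}^{N} G'_n(\zeta^j_n, \mu^j_n)}\, \delta_{(\zeta^i_n, \mu^i_n)}$$

If we set $\mu^i_n = \mathcal{N}(m^i_n, P^i_n) \in \mathrm{Gauss}(\mathbb{R}^p)$, then by (12.65) the weights are given at each step by the Radon-Nikodym derivatives

$$G'_n(\zeta^i_n, \mu^i_n) = \mu^i_n(G_{\zeta^i_n, n}) = \frac{d\mathcal{N}(C(\zeta^i_n)\, m^i_n,\ C(\zeta^i_n) P^i_n C(\zeta^i_n)' + R^v_n)}{d\mathcal{N}(0, R^v_n)}(y_n)$$

During the mutation transition, the evolution of the selected (path, measure) particles

$$\hat\xi^i_n = (\hat\zeta^i_n, \hat\mu^i_n) \longrightarrow \xi^i_{n+1} = (\zeta^i_{n+1}, \mu^i_{n+1})$$

is defined as follows. First, each selected particle $\hat\zeta^i_n$ evolves according to the transition $M^{(1)}_{n+1}$ so that the $\zeta^i_{n+1}$ are conditionally independent random variables with respective distributions $M^{(1)}_{n+1}(\hat\zeta^i_n, \cdot)$. Then, given the selected states $\hat\zeta^i_n \to \zeta^i_{n+1}$, the measure component $\mu^i_{n+1}$ is defined by the deterministic transition

$$\mu^i_{n+1} = \Phi_{n+1}\big((\hat\zeta^i_n, \zeta^i_{n+1}), \hat\mu^i_n\big)$$

From (12.63), we find that

$$\hat\mu^i_n = \mathcal{N}(\hat m^i_n, \hat P^i_n) \in \mathrm{Gauss}(\mathbb{R}^p) \longrightarrow \mu^i_{n+1} = \mathcal{N}(m^i_{n+1}, P^i_{n+1}) \in \mathrm{Gauss}(\mathbb{R}^p)$$

with

$$m^i_{n+1} = \mathrm{Mean}_{n+1}\big((\hat\zeta^i_n, \zeta^i_{n+1}), (\hat m^i_n, \hat P^i_n)\big)$$
$$P^i_{n+1} = \mathrm{Cov}_{n+1}\big((\hat\zeta^i_n, \zeta^i_{n+1}), (\hat m^i_n, \hat P^i_n)\big)$$

Let $\eta'^N_n$ be the particle density profiles associated with this $N$-interacting Kalman-Bucy filter

$$\eta'^N_n = \frac{1}{N} \sum_{i=1}^{N} \delta_{(\zeta^i_n, \mu^i_n)}$$

The asymptotic behavior of these empirical measures is discussed from Chapter 7 to Chapter 10. To illustrate the impact of these results, we give next a simple $L_p$ mean error estimate presented in Section 7.4.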

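Proposition 12.6.1 For each $p \ge 1$, and any $f \in \mathcal{B}_b(E^{(1)} \times \mathcal{P}(\mathbb{R}^p))$, we have for each $N \ge 1$

$$\sqrt{N}\ \mathbb{E}\big(|\eta'^N_n(f) - \eta'_n(f)|^p\big)^{1/p} \le a(p)\, b(n)\, \|f\|$$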







Note that the potential functions $G'_n$ do not satisfy the regularity condition (G) stated on page 115. Nevertheless, we can prove Proposition 12.6.1 by combining the arguments developed in Section 7.4 for general nonnegative potential functions with some traditional cut-off arguments.
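To make the selection/mutation mechanism of the interacting Kalman-Bucy filter concrete, here is a scalar sketch under assumptions that go beyond the text: the first component takes values in a finite set of regimes `{0, ..., K-1}` with transition matrix `trans`, the linear/Gaussian component is one-dimensional, the noise variances `rw` and `rv` are constant, and `A`, `B`, `C`, `mean0`, `var0` are arrays indexed by regime. All names are hypothetical. Each particle carries a regime and a Gaussian predictor $\mathcal{N}(m, P)$; the weights are the Radon-Nikodym derivatives of (12.65).

```python
import numpy as np

def interacting_kalman_filter(ys, trans, init_law, A, B, C, mean0, var0,
                              rw, rv, n_particles, rng):
    """Scalar interacting Kalman filter sketch (finite regime space is an assumption).

    Returns the log of the particle estimate of the normalizing constant
    (relative to the reference N(0, rv)) and the final particles (zeta, m, P).
    """
    A, B, C, mean0, var0 = (np.asarray(v, dtype=float) for v in (A, B, C, mean0, var0))
    trans = np.asarray(trans, dtype=float)
    K = len(init_law)
    zeta = rng.choice(K, size=n_particles, p=init_law)
    m, P = mean0[zeta], var0[zeta]                 # Gaussian one-step predictors
    log_z = 0.0
    for y in ys:
        # annealed potential: density ratio dN(C m, C P C + rv) / dN(0, rv) at y
        s = C[zeta] ** 2 * P + rv
        w = np.sqrt(rv / s) * np.exp(-0.5 * (y - C[zeta] * m) ** 2 / s + 0.5 * y ** 2 / rv)
        log_z += np.log(w.mean())
        # selection of (state, measure) particles
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        zeta, m, P, s = zeta[idx], m[idx], P[idx], s[idx]
        # Kalman updating with the selected regime
        gain = P * C[zeta] / s
        m, P = m + gain * (y - C[zeta] * m), (1.0 - gain * C[zeta]) * P
        # mutation of the regime, then deterministic Kalman prediction
        zeta = np.array([rng.choice(K, p=trans[z]) for z in zeta])
        m, P = A[zeta] * m, A[zeta] ** 2 * P + B[zeta] ** 2 * rw
    return log_z, zeta, m, P
```

With a single regime all particles coincide, and the estimate reduces to the exact Kalman likelihood ratio, which gives a simple sanity check of the weights.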

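As traditionally, if instead of the Markov chain $X'_n = (X^1_n, \eta^{(2)}_{[X^1], n})$ we consider the Markov chain in path space

$$X'_n = (X^1_{[0,n]}, \eta^{(2)}_{[X^1], n}), \qquad X^1_{[0,n]} = (X^1_0, \ldots, X^1_n)$$

then the empirical measures associated with the resulting $N$ genealogical-tree-based algorithm

$$\eta'^N_n = \frac{1}{N} \sum_{i=1}^{N} \delta_{((\zeta^i_{0,n}, \ldots, \zeta^i_{n,n}),\ \mu^i_n)}$$

converge as $N \to \infty$ to the distributions in path space

$$\eta'_n = \mathrm{Law}\big(X^1_{[0,n]}, \eta^{(2)}_{[X^1], n} \mid Y_{[0,n-1]} = (y_0, \ldots, y_{n-1})\big)$$

and the same $L_p$-estimates hold.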

12.6.8 Exercises 

Exercise 12.6.2: Let $V = (V_n)_{n \ge 1}$ be a sequence of independent random variables such that $\mathbb{E}(V_n) = 0$ and $\sigma_n = \mathbb{E}(V_n^2)^{1/2} < \infty$. We consider a sequence of observations $Y_n = X + V_n$ of a single random variable $X$ that we assume to be independent of $V$ and such that $\sigma = \mathbb{E}(X^2)^{1/2} < \infty$.

• If X" = i Ep=i then check that E((X - = n~^ 

and conclude that 

n 

Jiim n-^ 53 < 00 E(X | y|o,„]) = X in L^CP) 

• We further assume that $X$ and $V_n$ are Gaussian random variables, and we set $\hat X_n = \mathbb{E}(X \mid Y_{[0,n]})$. Using for instance the Kalman recursions provided on page 515, check that

$$\hat X_{n+1} = \hat X_n + g_{n+1}\,(Y_{n+1} - \hat X_n)$$

with the gain term $g_{n+1} = q_n/(q_n + \sigma^2_{n+1})$, where $q_n = \mathbb{E}((X - \hat X_n)^2)$. Using the fact that $q_n = (1 - g_n)\, q_{n-1}$, show that

$$q_{n+1}/q_n = 1 - g_{n+1} = q_n^{-1}/(q_n^{-1} + \sigma_{n+1}^{-2})$$

Conclude that $\hat X_{n+1} = (q_{n+1}/q_n)\, \hat X_n + (1 - q_{n+1}/q_n)\, Y_{n+1}$ and







Exercise 12.6.3: This exercise is taken from [121]. We extend the HMM 
model presented in Section 12.6.5 to the time index Z. We fix some param- 
eter 0 6 5 and an observation sequence y = (yn)n€Z- For any p,q,n € Z 
with p < 9 < n, we slightly abuse the notation and we let 

be the solution of the filtering equation starting at rj$^p = Jx, with x e 
at time p G Z. In the same abusive notation, check that 

^ ^P,nM${Ge^y^^i){x) =P0{yn-^l I Vp^l^^p = 

• Derive a Feynman-Kac representation of Pp^n{xp ^ .) and prove that 
Ep^n ~ Ep^qPq^n 

^^)(x,dx') = Pa(X,€dx'|yp’Vi=y?+i,Xp = x) 



• C!onclude that for any (x,x') € and ip,p') G we have the 
uniform estimates 



||Pp,„(x, .) - Pp.,„(x', .)||tv < 0{PpVp',n) < (1 - £2)-(pV) 



and 









< 2e“^(l - £2)n-(pvp') 



• Let be the distribution of the stationary Markov chain (Xn, Yn)nez 

with time index Z. Deduce that (log j,n^^(^^»V'n-»-i))m>o is a 

uniform Cauchy sequence that converges P^*-a.s. to some Aoo,n(^i ^ 
Li(P^*) whose values do not depend on x. 

• For p' = 0 and p = -m, m € N, show that 

||P0,n(x,.)-P-„.,n(x',.)|ltv<(l-eV 

and 

5 2s-'(l-eY 

Deduce the imiform Pt)«-a.s. estimate 

|log^,:v^n.„A/,(G,,y„,J - Aoo.n(^,^*)| < 2e~\l - 
Since (Aoo,n(^)^*))n >0 forms a P®. -stationary sequence prove that 
lim An(ff,0*) = E®(Aoo,o(^,^*)) (P«* *a-e. and in Li(P®.)) 







Exercise 12.6.4: Let $(X^{i,N})_{1 \le i \le N}$ be an exchangeable sequence of random variables, taking values in some measurable space $(E, \mathcal{E})$. Also let $d\mu \propto e^{-V} d\lambda$ be the Boltzmann-Gibbs measure associated with a reference measure $\lambda$ and a nonnegative potential function $V$ with $\lambda(e^{-V}) > 0$. Suppose we have, for any $q \le N$, the following propagation-of-chaos estimate

$$\|\mathrm{Law}(X^{1,N}, \ldots, X^{q,N}) - \mu^{\otimes q}\|_{\mathrm{tv}} \le \epsilon_N(q)$$

where $\lim_{N \to \infty} \epsilon_N(q(N)) = 0$, for some $\lim_{N \to \infty} q(N) = \infty$.

• Let V\ be the A-essential infimum of V. Prove that for any 5 > 0, we 
have 

p (Ai<i<,(N)K(X‘-'^) >Vx + S)< eN{q{N))+{l-ft{V < 

• We consider the elementary 1-dimensional filtering model defined by 
the pair Markov chain 

r X„ = a„(X„_i)-hW„, Xo = Wo 

1 Vn = bn{Xn) + V„ 

where W„, V^n are iid Gaussian variables with common distribution 
A/’(0, 2/13), with /? > 0. Check that a version of the conditional distri- 
bution of W[o,„] = (Wo, . . . , W„) given yjo,„| = (lo, • • • , I'n) = y[o,n] 
is given by the formula 

dMn = dP (W( 0.„1 e . I yjo.n) = y(O.nl) « e"'’'""-'!--! dA„ (12.68) 

where An stands for the Lebesgue measure on and the potential 
function Ki,y(o,n) defined by 

^n,v,o.„,(«^[0,n]) = + ^(yp - bpix'^))^ 

p=0 p=0 

In the above display, represents the solution of the controlled sys- 
tem x);; = a„(i;^_i)-l-t«„, starting at xq = wq. Let I/’ = (WA^„)o<p<„, 
1 <i < N,he the genealogical tree model associated to the Feynman- 
Kac distribution (12.68). Using Theorem 8.3.3, check that 

||Law(e...,f/’)-M®ltv<^6(n) 

Prove that for any q{N) = o{l/y/N) we have the convergence in 
probability 



lim 

N-^oo 







References 



[1] A. de Acosta. On large deviations of empirical measures in the τ-topology. J. Appl. Probab., 31A:41-47, 1994. Studies in applied probability.

[2] A. de Acosta. Projective systems in large deviation theory. II. Some applications. In Probability in Banach spaces, 9 (Sandbjerg, 1993), volume 35 of Progr. Probab., pages 241-250. Birkhäuser, Boston, 1994.

[3] A. de Acosta. Exponential tightness and projective systems in large deviation theory. In Festschrift for Lucien Le Cam, pages 143-156. Springer, New York, 1997.

[4] D. Aldous and U. Vazirani. Go with the winners algorithms. In Proc. 35th Symp. Foundations of Computer Sci., pages 492-501, 1994.

[5] J.-M. Alliot, D. Delahaye, J.-L. Farges, and M. Schoenauer. Genetic 
algorithms for automatic regrouping of air traffic control sectors. In 
J.R. McDonnell, R.G. Reynolds, and D.B. Fogel, editors. Proceedings 
of the 4th Annual Conference on Evolutionary Programming, pages 
657-672. MIT Press, Cambridge, 1995. 

[6] D.J. Amit, G. Parisi, and L. Peliti. Asymptotic behavior of the “true” 
self avoiding walk. Phys. Rev. B, 27:1635-1645, 1983. 

[7] C. Andrieu and A. Doucet. Optimal estimation of amplitude and 
phase modulated signals. Monte Carlo Methods Appl., 7(1-2):1-14, 
2001. 




[8] C. Andrieu, A. Doucet, and W.J. Fitzgerald. An introduction to 
Monte Carlo methods for Bayesian data analysis. In Nonlinear Dy-
namics and Statistics (Cambridge, 1998), pages 169-217. Birkhäuser,
Boston, 2001.

[9] C. Andrieu, A. Doucet, W.J. Fitzgerald, and J.-M. Pérez. Bayesian
computational approaches to model selection. In Nonlinear and Non-
stationary Signal Processing (Cambridge, 1998), pages 1-41. Cam-
bridge University Press, Cambridge, 2000.

[10] C. Andrieu, A. Doucet, and E. Punskaya. Sequential Monte Carlo 
methods for optimal filtering. In Sequential Monte Carlo Methods 
in Practice, Statistics for Engineering and Information Science,
pages 79-95. Springer, New York, 2001. 

[11] S. Asmussen and R.Y. Rubinstein. Steady state rare events simula-
tions in queueing models and its complexity properties. Advances in
queueing, Probab. Stochastics Ser., pages 429-461, CRC, Boca Raton, 
FL, 1995. 

[12] R. Assaraf, M. Caffarel, and A. Khelif. Diffusion Monte Carlo methods
with a fixed number of walkers, Phys. Rev. E, vol. 61, no. 4, pp. 
4566-4575, 2000. 

[13] R. Atar, F. Viens, and O. Zeitouni. Robustness of Zakai's equation
via Feynman-Kac representations. In Q. Zhang, W.M. McEneaney, and
G. Yin, editors, Stochastic Analysis, Control, Optimization and Ap-
plications, pages 339-352. Birkhäuser, Boston, 1999.

[14] K. B. Athreya and P. Jagers, editors. Classical and Modern Branch-
ing Processes, volume 84 of The IMA Volumes in Mathematics and Its
Applications. Papers from the IMA Workshop held at the University
of Minnesota, Minneapolis, MN, June 13-17, 1994, Springer-Verlag,
New York, 1997.

[15] R. Azencott. Grandes déviations et applications. In P.L. Hennequin,
editor, École d'Été de Saint Flour VIII, Lecture Notes in Mathe-
matics 774, pages 1-176. Springer-Verlag, Berlin, 1980.

[16] B. Azimi-Sadjadi and P.S. Krishnaprasad. Approximate nonlinear fil-
tering and its applications for GPS. Proceedings of 39th IEEE Con-
ference on Decision and Control, 1579-84, Sydney, Australia, Dec.
2000.

[17] B. Azimi-Sadjadi and P.S. Krishnaprasad. Change detection for non
linear systems, a particle filtering approach. Proceedings of 2002 
American Control Conference, ACC2002. 




[18] D.A. Bader, J. JáJá, and R. Chellappa. Scalable data parallel algo-
rithms for texture synthesis and compression using Gibbs random
fields. Technical Report CS-TR-3123 and UMIACS-TR-93-80, UMI-
ACS and Electrical Engineering, University of Maryland, College
Park, MD, 1993.

[19] R.R. Bahadur and R. Ranga Rao. On deviations of the sample mean. 
Ann. Math. Stat., 31:1015-1027, 1960.

[20] R.R. Bahadur and S.L. Zabell. Large deviations of the sample mean 
in general vector spaces. Ann. Probab., 7:587-621, 1979. 

[21] J. Baker. Adaptive selection methods for genetic algorithms. In 
J. Grefenstette, editor, Proceedings of the International Conference
on Genetic Algorithms and Their Applications. L. Erlbaum Asso- 
ciates, Hillsdale, NJ, 1985. 

[22] J. Baker. Reducing bias and inefficiency in the selection algorithm. In
J. Grefenstette, editor. Proceedings of the Second International Con- 
ference on Genetic Algorithms and Their Applications. L. Erlbaum 
Associates, Hillsdale, NJ, 1987. 

[23] A. Bakirtzis, S. Kazarlis, and V. Petridis. A genetic algorithm solu-
tion to the economic dispatch problem. IEE Proceedings-C, vol. 141,
no. 4, pp. 377-382, July 1994.

[24] Y. Bar-Shalom and T.E. Fortmann. Tracking and Data Association.
Academic Press, New York, 1988. 

[25] P. Barbe and P. Bertail. The Weighted Bootstrap. Lecture Notes in 
Statistics 98. Springer-Verlag, Berlin, 1995.

[26] U. Bastolla, H. Frauenkron, E. Gerstner, W. Nadler, and P. Grass-
berger. Testing a new Monte Carlo algorithm for protein folding.
Proteins: Structure, Function and Genetics 32, 52-66 (1998). 

[27] N. Bellomo and M. Pulvirenti. Generalized kinetic models in applied 
sciences. In Modeling in Applied Sciences. Modeling and Simulation 
in Science, Engineering, and Technology, 1-19. Birkhäuser, Boston,
2000. 

[28] N. Bellomo and M. Pulvirenti, editors. Modeling in Applied Sciences. 
Modeling and Simulation in Science, Engineering, and Technology.
Birkhäuser, Boston, 2000.

[29] B. Berge, I.D. Chueshov, and P.A. Vuillermot. Solutions to certain
parabolic SPDE's driven by Wiener processes. Stochastic Process.
Appl., 92:237-263, 2001.




[30] A. Berretti and A.D. Sokal. J. Stat. Phys., 40(485), 1985.

[31] L. Bertini and G. Giacomin. On the long-time behavior of the 
stochastic heat equation. Probab. Theory Related Fields, 114(3):279- 
289, 1999. 

[32] G. Birkhoff. Positivity and criticality. PSAM, vol. 11, 111-126,
(1957). 

[33] R. Bleck, C. Rooth, D. Hu, and L.T. Smith. Salinity-driven thermo-
haline transients in a wind and thermohaline forced isopycnic coor-
dinate model of the North Atlantic. J. of Phys. Oceanogr., 22:1486-
1515, 1992.

[34] H.A.P. Blom and Y. Bar-Shalom. The interacting multiple model 
algorithm for systems with Markovian switching coefficients. IEEE 
Trans. on Autom. Control, 38(3):780-783, 1998.

[35] T. Bollerslev and P.E. Rossi. In P.E. Rossi, editor. Introduction to 
Modelling Stock Market Volatility. Bridging the Gap to Continuous 
Time. Academic Press, New York, 1996. 

[36] A.A. Borovkov. Boundary-value problems for random walks and 
large deviations in function spaces. Theory Probab. Appl., 12:575-
595, 1967. 

[37] D. Brigo and B. Hanzon. On some filtering problems arising in math-
ematical finance. The interplay between insurance, finance, and con-
trol. Insurance Math. Econ., 22(1):53-64, 1998.

[38] K. Burdzy, R. Holyst, and P. March. A Fleming-Viot particle rep- 
resentation of Dirichlet Laplacian. Commun. Math. Phys., 214:679- 
703, 2000. 

[39] B.P. Carlin, N.G. Polson, and D.S. Stoffer. A Monte-Carlo approach
to nonnormal and nonlinear state-space modeling. J. Am. Stat. As- 
soc., 87(418):493-500, 1992. 

[40] R.A. Carmona and S.A. Molchanov. Parabolic Anderson model and 
intermittency. Mem. Am. Math. Soc. 108, no. 518, (1994). 

[41] R.A. Carmona, S.A. Molchanov, and F.G. Viens. Sharp upper bound 
on exponential behavior of a stochastic partial differential equation. 
Random Operators Stochastic Equations, 4(1):43-49, 1996.

[42] R.A. Carmona and F. Viens. Almost-sure exponential behavior
of a stochastic Anderson model with continuous space parameter. 
Stochastics Stochastics Rep., 62(3-4), 251-273, 1998. 




[43] H. Carvalho, P. Del Moral, A. Monin, and G. Salut. Optimal nonlin-
ear filtering in GPS/INS integration. IEEE Trans. Aerosp. Electron.
Syst., 33(3):835-850, 1997.

[44] O. Catoni. Rough large deviations estimates for simulated annealing:
Application to exponential schedules. Ann. Probab., 20:1109-1146, 
1992. 

[45] R. Cerf. Asymptotic convergence of a genetic algorithm. C. R. Acad. 
Sci. Paris Sér. I Math., 319(3):271-276, 1994.

[46] R. Cerf. A new genetic algorithm. C. R. Acad. Sci. Paris Sér. I
Math., 319(9):999-1004, 1994. 

[47] R. Cerf. A new genetic algorithm. Ann. Appl. Probab., 6(3):778-817, 
1996. 

[48] R. Cerf. Asymptotic convergence of genetic algorithms. Adv. Appl. 
Probab., 30(2):521-550, 1998. 

[49] F. Cerou, P. Del Moral, F. LeGland, and P. Lezaud. Genetic ge-
nealogical models in rare event analysis. Publications du Laboratoire
de Statistiques et Probabilités, Toulouse III, 2002.

[50] M. Chaleyat-Maurel and D. Michel. Des résultats de non existence
de filtres de dimension finie. C. R. Acad. Sc. de Paris Série I Math.,
296, no. 22, 933-936, 1983. 

[51] R. Chen, J.S. Liu, and W.H. Wong. Rejection control and sequential 
importance sampling. J. Am. Stat. Assoc., 93(443):1022-1031, 1998.

[52] S. Chib, S. Kim, and N. Shephard. Stochastic volatility: likelihood
inference and comparison with ARCH models. Rev. Econ. Stud., 
65:361-394, 1998. 

[53] S. Chib, F. Nardari, and N. Shephard. Markov chain Monte-Carlo 
methods for generalized stochastic volatility models. J. of Econ. 108, 
281-316, 1998. 

[54] Y.S. Chow and H. Teicher. Probability Theory, Independence, Inter-
changeability and Martingales, 2nd ed. Springer Texts in Statistics,
Springer-Verlag, New York, 1988.

[55] K.L. Chung. A Course in Probability Theory. A Series of Mono-
graphs and Textbooks, 2nd ed., Probability and Mathematical Statis-
tics, vol. 21, Academic Press, New York, 1974.

[56] G.A. Churchill. Stochastic models for heterogeneous DNA sequences. 
Bull. Math. Biol., 51:79-94, 1989. 




[57] T.C. Clapp and S.J. Godsill. Fix lag smoothing using sequential im- 
portance sampling. In A.P. Dawid, J.M. Bernardo, J.O. Berger, and 
A.F.M. Smith, editors, Bayesian Statistics, pages 743-752. Oxford 
University Press, Oxford, 1999. 

[58] C.S. Clark. Multiple model adaptive estimation and control redis- 
tribution performance on the VISTA F-16 during partial actuator 
impairments. MS Thesis, School of Engineering, Air Force Institute 
of Technology, Wright-Patterson AFB, OH, 1997. 

[59] J.M.C. Clark, D.L. Ocone, and C. Coumarbatch. Relative entropy
and error bounds for filtering of Markov processes. Math. Control
Signal Syst., 12(4):346-360, 1999.

[60] J.E. Cohen, Y. Iwasa, G. Rautu, M.B. Ruskai, E. Seneta, and G. Zba-
ganu. Relative entropy under mappings by stochastic matrices. Lin-
ear Algebra Appl., 179:211-235, 1993.

[61] F. Comets. Large deviations for a conditional probability distribu- 
tion. Applications to random interacting Gibbs measures. Probab. 
Theory Related Fields, 80:407-432, 1989. 

[62] H. Cramér. Sur un nouveau théorème limite de la théorie des prob-
abilités. Act. Sci. et ind., 3:5-23, 1938.

[63] D. Crisan, J. Gaines, and T.J. Lyons. A particle approximation of 
the solution of the Kushner-Stratonovitch equation. SIAM J. Appl. 
Math., 58(5):1568-1590, 1998. 

[64] D. Crisan and T.J. Lyons. Nonlinear filtering and measure valued 
processes. Probab. Theory Related Fields, 109:217-244, 1997. 

[65] D. Crisan and T.J. Lyons. A particle approximation of the solution of 
the Kushner-Stratonovitch equation. Probab. Theory Related Fields, 
115(4):549-578, 1999.

[66] D. Crisan, P. Del Moral, and T.J. Lyons. Interacting particle sys- 
tems approximations of the Kushner-Stratonovitch equation. Adv. 
in Appl. Probab., 31(3):819-838, 1999. 

[67] D. Crisan, P. Del Moral, and T.J. Lyons. Non linear filtering using 
branching and interacting particle systems. Markov Processes Related 
Fields, 5(3):293-319, 1999. 

[68] I. Csiszár. Eine informationstheoretische Ungleichung und ihre An-
wendung auf den Beweis der Ergodizität von Markoffschen Ketten.
Magyar Tud. Akad. Mat. Kutató Int. Közl., 8:85-108, 1963.




[69] I. Csiszár. Sanov property, generalized I-projection and a conditional
limit theorem. Ann. Probab., 12(3):768-793, 1984. 

[70] D. Dacunha-Castelle. Formule de Chernov pour une suite de variables
réelles. In Grandes déviations et Applications Statistiques, pages 19-
24. Astérisque 68, Paris, 1979.

[71] M. Davy, P. Del Moral, and A. Doucet. Méthodes Monte-Carlo sé-
quentielles pour l'analyse spectrale bayésienne. In Proceedings of the
GRETSI Conference, Paris 2003. 

[72] D. Dawson. Measure-valued Markov processes. In P.L. Hennequin,
editor, Lectures on Probability Theory. École d'Été de Probabilités de
Saint-Flour XXI-1991, Lecture Notes in Mathematics 1541. Springer-
Verlag, Berlin, 1993.

[73] D. Dawson and J. Gärtner. Large deviations from the McKean
Vlasov limit for weakly interacting diffusions. Stochastics, 20:247- 
308, 1987. 

[74] D. Dawson and J. Gärtner. Analytic aspects of multilevel large devi-
ations. In Asymptotic Methods in Probability and Statistics (Ottawa,
ON, 1997), pages 401-440. North-Holland, Amsterdam, 1998. 

[75] P. Del Moral. Non-linear filtering: interacting particle resolution. 
Markov Processes Related Fields, 2(4):555-581, 1996. 

[76] P. Del Moral. Measure valued processes and interacting particle 
systems. Application to nonlinear filtering problems. Ann. Appl. 
Probab., 8(2):438-495, 1998. 

[77] P. Del Moral and M. Doisy. Maslov idempotent probability calculus. 
Part I. Theory Probab. Appl., 43(4):735-751, 1998.

[78] P. Del Moral and M. Doisy. Maslov idempotent probability calculus. 
Part II. Theory Probab. Appl., 44(2):384-400, 1999.

[79] P. Del Moral and M. Doisy. On the applications of Maslov optimiza- 
tion theory. Math. Notes, 69(2):232-244, 2001. 

[80] P. Del Moral and A. Doucet. On a class of genealogical and inter-
acting Metropolis models. In J. Azéma, M. Émery, M. Ledoux, and
M. Yor, editors, Séminaire de Probabilités XXXVII, Lecture Notes in
Mathematics no. 1832, pp. 415-446. Springer-Verlag, Berlin, 2004.

[81] P. Del Moral, A. Doucet, and G. Peters. Sequential Monte 
Carlo samplers. Technical Report, Cambridge University, CUED/F- 
INFENG/TR 443, Dec. 2002. 




[82] P. Del Moral and A. Doucet. Particle motions in absorbing medium 
with hard and soft obstacles. To appear in Stochastic Analysis and 
Applications, 2004. 

[83] P. Del Moral and A. Guionnet. Large deviations for interacting par- 
ticle systems. Applications to nonlinear filtering problems. Stochastic 
Processes Appl., 78:69-95, 1998.

[84] P. Del Moral and A. Guionnet. A central limit theorem for nonlin- 
ear filtering using interacting particle systems. Ann. Appl. Probab., 
9(2):275-297, 1999. 

[85] P. Del Moral and A. Guionnet. On the stability of measure valued 
processes with applications to filtering. C. R. Acad. Sc. de Paris 
Série I Math., 329(5):429-434 (1999).

[86] P. Del Moral and A. Guionnet. On the stability of interacting pro- 
cesses with applications to filtering and genetic algorithms. Ann. 
Inst. Henri Poincaré, 37(2):155-194, 2001.

[87] P. Del Moral and J. Jacod. Interacting particle filtering with discrete 
observations. In N.J. Gordon, A. Doucet, and J.F.G. de Freitas, ed-
itors, Sequential Monte-Carlo Methods in Practice. Springer-Verlag,
New York, 2001. 

[88] P. Del Moral and J. Jacod. Interacting particle filtering with discrete- 
time observations: asymptotic behaviour in the Gaussian case. In 
Stochastics in Finite and Infinite Dimensions, Trends in Mathemat-
ics, pages 101-122. Birkhäuser, Boston, 2001.

[89] P. Del Moral and J. Jacod. The Monte-Carlo method for filtering 
with discrete-time observations: Central limit theorems. In Numeri-
cal Methods and Stochastics (Toronto, ON, 1999), volume 34 of Fields
Inst. Commun., pages 29-53. American Mathematical Society, Prov- 
idence, RI, 2002. 

[90] P. Del Moral, J. Jacod, and P. Protter. The Monte Carlo method 
for filtering with discrete time observations. Probab. Theory Related 
Fields, 120:346-368, 2001. 

[91] P. Del Moral, M.A. Kouritzin, and L. Miclo. On a class of discrete
generation interacting particle systems. Electron. J. Probab., 6(16):1-
26, 2001. 

[92] P. Del Moral, L. Kallel, and J. Rowe. Modeling genetic algorithms
with interacting particle systems. Rev. Mat. Teoría Apl., 8(2):19-78,
2001. 




[93] P. Del Moral, M. Ledoux, and L. Miclo. On contraction properties of 
Markov kernels. Probab. Theory Related Fields, 126:395-420, 2003. 

[94] P. Del Moral and L. Miclo. On the convergence and the applications 
of the generalized simulated annealing. SIAM J. Control Optim., 
37(4):1222-1250, 1999. 

[95] P. Del Moral and L. Miclo. About the strong propagation of chaos for 
interacting particle approximations of Feynman-Kac formulae. Pub-
lications du Laboratoire de Statistique et Probabilités, no. 08-00, Uni-
versité Paul Sabatier, Toulouse, France, 2000.

[96] P. Del Moral and L. Miclo. Asymptotic results for genetic algorithms 
with applications to nonlinear estimation. In L. Kallel and B. Naudts, 
editors, Proceedings of the Second EvoNet Summer School on Theo-
retical Aspects of Evolutionary Computing, Natural Computing Se-
ries. Springer-Verlag, New York, 2000.

[97] P. Del Moral and L. Miclo. Branching and interacting particle sys- 
tems approximations of Feynman-Kac formulae with applications to 
nonlinear filtering. In J. Azéma, M. Émery, M. Ledoux, and M. Yor,
editors, Séminaire de Probabilités XXXIV, Lecture Notes in Mathe-
matics 1729, pages 1-145. Springer-Verlag, Berlin, 2000.

[98] P. Del Moral and L. Miclo. A Moran particle system approximation 
of Feynman-Kac formulae. Stochastic Processes Appl., 86:193-216,
2000. 

[99] P. Del Moral and L. Miclo. Genealogies and increasing propagation 
of chaos for Feynman-Kac and genetic models. Ann. Appl. Probab., 
11(4):1166-1198, 2001.

[100] P. Del Moral and L. Miclo. On the stability of non linear Feynman-
Kac semi-groups. Annales de la Faculté des Sciences de Toulouse,
11(2):135-175, 2002.

[101] P. Del Moral and L. Miclo. Annealed Feynman-Kac models. Com- 
mun. Math. Phys., 235(2):191-214, 2003. 

[102] P. Del Moral and L. Miclo. Particle approximations of Lyapunov ex-
ponents connected to Schrödinger operators and Feynman-Kac semi-
groups. ESAIM: Probability and Statistics, no. 7, pp. 171-208, 2003.

[103] P. Del Moral, J.C. Noyer, G. Rigal, and G. Salut. Traitement par-
ticulaire du signal radar, détection, estimation et reconnaissance de
cibles aériennes. Technical report, LAAS/CNRS, Toulouse, 1992.




[104] P. Del Moral, J.C. Noyer, and G. Salut. Résolution particulaire et
traitement non-linéaire du signal : application radar/sonar. In Traite-
ment du signal, (12):4, 287-301, 1995.

[105] P. Del Moral, G. Rigal, and G. Salut. Estimation et commande op-
timale non linéaire. Technical Report 2, LAAS/CNRS, Toulouse,
March 1992. Contract DRET-DIGILOG.

[106] P. Del Moral, G. Rigal, and G. Salut. Estimation et commande op-
timale non-linéaire : un cadre unifié pour la résolution particulaire.
Technical report, LAAS/CNRS, Toulouse, 1992. Contract DRET-
DIGILOG-LAAS/CNRS.

[107] P. Del Moral, G. Rigal, and G. Salut. Filtrage non-linéaire non-
gaussien appliqué au recalage de plates-formes inertielles. Tech-
nical report, LAAS/CNRS, Toulouse, 1992. STCAN/DIGILOG-
LAAS/CNRS contract no. A.91.77.013.

[108] P. Del Moral and G. Salut. Random particle methods in (max,+) op-
timization problems. In J. Gunawardena, editor, Idempotency, Publi-
cations of the Newton Institute, pages 383-392. Cambridge University
Press, Cambridge, 1998.

[109] P. Del Moral and T. Zajic. On Laplace-Varadhan's integral lemma.
C. R. Acad. Sci. Paris Série I Math., 334(8):693-698, 2002.

[110] P. Del Moral and T. Zajic. A note on the Laplace-Varadhan integral
lemma. Bernoulli, 9(1):49-65, 2003.

[111] F. Dellaert, D. Fox, W. Burgard, and S. Thrun. Monte-Carlo localiza- 
tion for mobile robots. IEEE International Conference on Robotics 
and Automation, ICRA99, IEEE, New York, 1999. 

[112] A. Dembo and O. Zeitouni. Large Deviations Techniques and Appli- 
cation. Jones and Bartlett Publishers, Boston, 1993. 

[113] H. Derin. The use of Gibbs distributions in image processing. In 
Blake and H. V. Poor, editors. Communications and Networks, pages 
266-298. Springer-Verlag, New York, 1986.

[114] J.-D. Deuschel and D.W. Stroock. Large Deviations. Pure and Ap- 
plied Mathematics 137. Academic Press, New York, 1989. 

[115] K.A. Dill and T.C. Beutler. Protein Sci., 5(2037), 1996.

[116] R.L. Dobrushin. Central limit theorem for nonstationary Markov
chains, I, II. Theory of Probability and its Applications, 1(1 and 4):66-
80 and 330-385, 1956. 




[117] M.D. Donsker and S.R.S. Varadhan. Asymptotic evaluation of certain
Wiener integrals for large time. Functional integration and its appli-
cations (Proc. Internat. Conf., London, 1974), pp. 15-33. Clarendon
Press, Oxford, 1975.

[118] M.D. Donsker and S.R.S. Varadhan. Asymptotic evaluation of cer- 
tain Markov process expectations for large time, i. Commun. Pure 
Appl. Math., 28:1-47, 1975. 

[119] M.D. Donsker and S.R.S. Varadhan. Asymptotic evaluation of cer- 
tain Markov process expectations for large time, ii. Commun. Pure 
Appl. Math., 28:279-301, 1975. 

[120] M.D. Donsker and S.R.S. Varadhan. Asymptotic evaluation of cer- 
tain Markov process expectations for large time, iii. Commun. Pure 
Appl. Math., 29:389-461, 1976. 

[121] R. Douc, E. Moulines, and T. Rydén. Asymptotic properties of the
maximum likelihood estimator in autoregressive models with Markov 
regime. Preprint ENST, Paris 2003. 

[122] A. Doucet and C. Andrieu. On sequential Monte Carlo sampling 
methods for Bayesian filtering. Statistics and Computing, vol. 10,
no. 3, pp. 197-208, 2000. 

[123] A. Doucet and C. Andrieu. Particle filtering for partially observed 
Gaussian state space models. J. R. Stat. Soc. Ser. B, Stat. Methodol.,
64(4):827-836, 2002. 

[124] A. Doucet, N. de Freitas, and N. Gordon. An introduction to se-
quential Monte Carlo methods. In Sequential Monte Carlo Methods
in Practice, Statistics for Engineering and Information Science, pages 
3-14. Springer, New York, 2001. 

[125] A. Doucet, N. de Freitas, and N. Gordon, editors. Sequential Monte
Carlo Methods in Practice. Statistics for Engineering and Information
Science. Springer, New York, 2001.

[126] J. Dugundji. Topology. Prentice-Hall of India, New Delhi, 1975. 

[127] P. Dupuis and R.S. Ellis. A Weak Convergence Approach to the 
Theory of Large Deviations. Vol. 18, Wiley Series in Probability and 
Statistics, John Wiley & Sons, Chichester 2000. 

[128] E.B. Dynkin. An Introduction to Branching Measure-Valued Pro-
cesses, vol. 6 of CRM Monograph Series. American Mathematical
Society, Providence, RI, 1994. 




[129] E.B. Dynkin and A. Mandelbaum. Symmetric statistics, Poisson pro-
cesses and multiple Wiener integrals. Ann. Stat., 11:739-745, 1983.

[130] S.F. Edwards. The statistical mechanics of polymers with excluded 
volume. Proc. Phys. Sci., 85:613-624, 1965.

[131] R.J. Elliott and J. van der Hoek. An application of hidden Markov
models to asset allocation problems. Finance Stochastics, 1:229-238, 
1997. 

[132] R.J. Elliott, L. Aggoun, and J.B. Moore. Hidden Markov models. 
Vol. 29, Applications of Mathematics, Springer-Verlag, New York,
1995. 

[133] R.S. Ellis. Large Deviations and Statistical Mechanics. Springer- 
Verlag, New York, 1985. 

[134] A. Etheridge and P. March. A note on superprocesses. Prob. Th. 
Rel. Fields, 89:141-147, 1991. 

[135] G. Evensen. Sequential data assimilation with a nonlinear quasi-
geostrophic model using Monte-Carlo methods to forecast error statis-
tics. J. Geophys. Res., 99:143-162, 1994.

[136] G. Evensen. Application of ensemble integrations for predictability
studies and data assimilation. Monte-Carlo simulations in oceanogra-
phy. Aha Huliko'a Hawaiian Winter Workshop, University of Hawaii
at Manoa, 1997. 

[137] G. Evensen. Sequential data assimilation for nonlinear dynamics: 
The ensemble Kalman filter. Oceanographic Forecasting: Conceptual 
Basis and Applications, N. Pinardi and J.D. Woods, editors. Springer- 
Verlag, Berlin, Heidelberg, 2002. 

[138] G. Evensen and P.J. Van Leeuwen. Assimilation of Geosat altimeter
data for the Agulhas current using an ensemble Kalman filter with a
quasi-geostrophic model. Mon. Weather Rev., 124:85-96, 1996.

[139] G. Evensen. The ensemble Kalman filter: Theoretical formulation 
and practical implementation. To appear in Ocean Dyn., 2003. 

[140] G. Evensen and V.E. Haugen. Assimilation of SLA and SST data 
into an OGCM for the Indian Ocean. Ocean Dyn., 52:133-151, 2002.

[141] G. Evensen and V.E. Haugen. Indian Ocean circulation: An inte- 
grated model and remote sensor study. J. Geophys. Res., 107:11-23, 
2002. 




[142] J.J. Ewell. Space shuttle orbiter entry through land navigation. 
IEEE International Conference on Intelligent Robots and Systems, 
pages 627-632, IEEE, New York, 1988. 

[143] C.M. Ewing, N.J. Gordon, and D.J. Salmond. Bayesian state estima-
tion for tracking and guidance using the bootstrap filter. AIAA J.
Guidance, Control Dyn., 18:1434-1443, 1995. 

[144] D. Fox, F. Dellaert, W. Burgard, and S. Thrun. Using the conden- 
sation algorithm for robust, vision-based mobile robot localization. 
In Proceedings of the IEEE International Conference on Computer 
Vision and Pattern Recognition, Fort Collins, CO, IEEE, New York, 
1999. 

[145] K.A. Fisher and P.S. Maybeck. Multiple model adaptive estimation 
with filter spawning. IEEE Trans. Aerosp. Electron. Syst., 38(3):755-
769, 2002. 

[146] G.S. Fishman. Monte-Carlo, concepts, algorithms and applications. 
Springer Series in Operations Research. Springer-Verlag, New York,
1996. 

[147] F. Fornari and A. Mele. Stochastic Volatility in Financial Markets -
Crossing the Bridge to Continuous Time. Kluwer, Dordrecht, 2000. 

[148] D. Fox, S. Thrun, F. Dellaert, and W. Burgard. Particle filters for 
robot localization. In Sequential Monte-Carlo Methods in Practice,
A. Doucet, N. de Freitas, and N. Gordon, editors. Springer-Verlag,
New York, 2000. 

[149] H. Frauenkron, M.S. Causo, and P. Grassberger. Two-dimensional 
self-avoiding walks on a cylinder. Phys. Rev. E, 59, R16-R19 (1999).

[150] D.R. Fredkin and J.A. Rice. Correlation functions of a finite-state
process with application to channel kinetics. Math. Biosci., 87:161- 
172, 1987. 

[151] J.F.G. de Freitas, M. Niranjan, A.H. Gee, and A. Doucet. Sequen- 
tial Monte-Carlo methods to train neural networks models. Neural 
Comput., 12(4):955-993, 2000. 

[152] J. Gärtner, W. König, and S.A. Molchanov. Almost sure asymptotics
for the continuous parabolic Anderson model. Probab. Theory Related 
Fields, 118(4):547-573, 2000. 

[153] M.J.J. Garvels and D.P. Kroese. A comparison of RESTART implementa-
tions. Proceedings of the 1998 Winter Simulation Conference, pages
601-608, IEEE Computer Society Press, Piscataway, New Jersey,
1998. 




[154] S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, 
and the Bayesian restoration of images. IEEE Trans. on Pattern
Anal. Mach. Intelligence, 6:721-741, 1984. 

[155] E. Ghysels, A.C. Harvey, and E. Renault. Stochastic Volatility. 
Statistical Methods in Finance. Handbook of Statistics 14, North- 
Holland, Amsterdam, 1996. 

[156] F. Le Gland, C. Musso, and N. Oudjane. An analysis of regularized 
interacting particle methods for nonlinear filtering. In Proceedings of 
the 3rd IEEE European Workshop on Computer-Intensive Methods in 
Control and Signal Processing, I. Rojicek, M. Valeckova, M. Karny, 
and K. Warwick, editors, pp. 167-174, Prague, 1998. 

[157] P. Glasserman, P. Heidelberger, P. Shahabuddin, and T. Zajic. Split- 
ting for rare event simulation: analysis of simple cases. Proceedings of 
the 1996 Winter Simulation Conference, pages 302-308, IEEE Com-
puter Society Press, Piscataway, 1996.

[158] P. Glasserman, P. Heidelberger, P. Shahabuddin, and T. Zajic. A 
large deviations perspective on the efficiency of multilevel splitting.
IEEE Trans. on Autom. Control, 43(12):1666-1679, 1998.

[159] P. Glasserman, P. Heidelberger, P. Shahabuddin, and T. Zajic. Mul- 
tilevel splitting for estimating rare event probabilities. Oper. Res., 
47(4):585-600, 1999. 

[160] S. Godsill, A. Doucet, and M. West. Maximum a posteriori sequence
estimation using Monte Carlo particle filters. Ann. Inst. Stat. Math.,
53(1):82-96, 2001.

[161] D.E. Goldberg. Genetic algorithms and rule learning in dynamic 
control systems. In Proceedings of the First International Conference 
on Genetic Algorithms, pages 8-15. L. Erlbaum Associates, Hillsdale, 
NJ, 1985. 

[162] D.E. Goldberg. Genetic Algorithms in Search, Optimization and Ma-
chine Learning. Addison-Wesley, Reading, MA, 1989.

[163] N.J. Gordon, D.J. Salmond, and C. Ewing. Bayesian state estimation
for tracking and guidance using the bootstrap filter. J. Guidance
Control Dyn., 18(6):1434-1443, 1995.

[164] N.J. Gordon, D.J. Salmond, and A.F.M. Smith. Novel approach to
nonlinear/non-Gaussian Bayesian state estimation. IEE Proc. F,
140:107-113, 1993. 




[165] C. Graham and S. Méléard. Stochastic particle approximations for
generalized Boltzmann models and convergence estimates. Ann.
Probab., 25(1):115-132, 1997.

[166] C. Graham and S. Méléard. Probabilistic tools and Monte-Carlo
approximations for some Boltzmann equations. In CEMRACS 1999
(Orsay), volume 10 of ESAIM Proceedings, pages 77-126 (electronic).
Société de Mathématiques Appliquées et Industrielles, Paris, 1999.

[167] P. Grassberger. Advanced sequential Monte-Carlo methods in
physics. In H. Rollnik and D. Wolf, editors, NIC Series, Proceed-
ings NIC Symposium, vol. 9, pages 1-12, John von Neumann Institute
for Computing, Jülich, 2002.

[168] U. Grenander. Elements of Pattern Theory. The Johns Hopkins Uni-
versity Press, Baltimore and London, 1996.

[169] A. Greven and F. den Hollander. A variational characterization of the 
speed of a one dimensional self-repellent random walk. Ann. Appl. 
Probab., 3:1067-1099, 1993. 

[170] P. Groeneboom, J. Oosterhoff, and F.H. Ruymgaart. Large deviation 
theorems for empirical probability measures. Ann. Probab., 7(4):553- 
586, 1979. 

[171] J.D. Hamilton. A new approach to the economic analysis of nonsta-
tionary time series and the business cycle. Econometrica, 57:357-384,
1989. 

[172] J.D. Hamilton. Analysis of time series subject to changes in regime. 
J. Economet., 45:39-70, 1990.

[173] T. Hara and G. Slade. The lace expansion for self avoiding walk in 
five or more dimensions. Rev. Math. Phys., 4:235-327, 1992. 

[174] G.H. Hardy, J.E. Littlewood, and G. Pólya. Inequalities. Cam-
bridge Mathematical Library. Cambridge University Press, Cam-
bridge, 1988. Reprint of the 1952 edition.

[175] T.E. Harris. Some mathematical models for branching processes. 
In Proceedings of the Second Berkeley Symposium on Mathematical 
Statistics and Probability, 1950, pages 305-328. University of Califor- 
nia Press, Berkeley and Los Angeles, 1951. 

[176] T.E. Harris. The Theory of Branching Processes. Die Grundlehren 
der Mathematischen Wissenschaften, Bd. 119. Springer-Verlag,
Berlin, 1963. 




[177] T.E. Harris and H. Kahn. Estimation of particle transmission by 
random sampling. Natl. Bur. Stand. Appl. Math. Ser., 12:27-30, 
1951. 

[178] W.K. Hastings. Monte-Carlo sampling methods using Markov chains 
and their applications. Biometrika, 57:97-109, 1970. 

[179] J.H. Hetherington, Observations on the statistical iteration of matri- 
ces, Physical Review A, vol. 30, no. 5, pp. 2713-2719, 1984. 

[180] C. C. Heyde, editor. Branching Processes, volume 99 of Lecture Notes 
in Statistics. Springer-Verlag, New York, 1995.

[181] T. Higuchi. Monte-Carlo filter using the genetic algorithm operators.
J. Stat. Comput. Simulation, 59(1):1-23, 1997.

[182] T. Higuchi. Self-organizing time series model. In N.J. Gordon,
A. Doucet, and J.F.G. de Freitas, editors, Sequential Monte-Carlo Meth-
ods in Practice, pages 428-444. Springer-Verlag, New York, 2001.

[183] J.H. Holland. Adaptation in Natural and Artificial Systems. Univer-
sity of Michigan Press, Ann Arbor, 1975.

[184] C. Hue, J.P. Le Cadre, and P. Pérez. A particle filter to track multiple
objects. IEEE Trans. Aerosp. and Electron. Syst., 38(3):791-812, 
2002. 

[185] M. Hürzeler and H.R. Künsch. Monte-Carlo approximations for gen-
eral state space models. J. Comput. Graphical Stat., 7(2):175-193,
1998. 

[186] D. Ingerman, K. Burdzy, R. Holyst, and P. March. Configurational
transition in a Fleming-Viot-type model and probabilistic interpreta-
tion of Laplacian eigenfunctions. J. Phys. A, 29:2633-2642, 1996.

[187] M. Isard and A. Blake. Contour tracking by stochastic propagation 
of conditional densities. Computer Vision, ECCV’96, B. Buxton and 
R. Cipolla, editors. Springer-Verlag, New York, 1996.

[188] M. Isard and A. Blake. Condensation-conditional density propagation
for visual tracking. Int. J. Comput. Vision, 29(1):5-28, 1998.

[189] J. Jacod and A.N. Shiryaev. Limit Theorems for Stochastic Pro- 
cesses. Series of Comprehensive Studies in Mathematics 288. 
Springer-Verlag, New York, 1987.

[190] B.H. Juang and L.R. Rabiner. Hidden Markov models for speech 
recognition. Technometrics, 33:251-272, 1991. 




[191] J.M. Johnson and Y. Rahmat-Samii. Genetic algorithms in electro- 
magnetics. In IEEE Antennas and Propagation Society International 
Symposium Digest, volume 2, pages 1480-1483. IEEE, New York, 
1996. 

[192] F. Jouve, L. Kallel, and M. Schoenauer. Mechanical inclusions identi- 
fication by evolutionary computation. Eur. J. Finite Elements, 5(5- 
6):619-648, 1996. 

[193] F. Jouve, L. Kallel, and M. Schoenauer. Identification of mechanical
inclusions. In D. Dasgupta and Z. Michalewicz, editors. Evolutionary
Computation in Engineering, pages 477-494. Springer-Verlag,
New York, 1997.

[194] M. Kac. On distributions of certain Wiener functionals. Trans. Am.
Math. Soc., 65:1-13, 1949.

[195] G. Kallianpur and C. Striebel. Stochastic differential equations occurring
in the estimation of continuous parameter stochastic processes.
Tech. Rep. 103, Department of Statistics, University of Minnesota,
Minneapolis, 1967.

[196] R.E. Kalman. A new approach to linear filtering and prediction prob- 
lems. ASME Trans., J. Basic Engineering, 82(D):35-50, 1960. 

[197] R.E. Kalman and R.S. Bucy. New results in linear filtering and pre- 
diction. ASME Trans., J. Basic Engineering, 83(D):95-108, 1961. 

[198] A. Kaneko and J.H. Park. Assimilation of coastal acoustic tomography
data into a barotropic ocean model. Geophys. Res. Lett.,
27:3373-3376, 2000.

[199] K. Karplus, C. Barrett, and R. Hughey. Hidden Markov models 
for detecting remote protein homologies. Bioinformatics, 14(10):846-856,
1998.

[200] T. Kato. Perturbation Theory for Linear Operators. Classics in 
Mathematics. Springer-Verlag, Berlin, Heidelberg, New York, 1980.

[201] A.I. Khintchin. Über einen neuen Grenzwertsatz der Wahrscheinlichkeitsrechnung.
Math. Ann., 101:745-752, 1929.

[202] S.J. Kim and R.A. Iltis. Performance comparison of particle and
extended Kalman filters algorithms for GPS C/A code tracking and
interference rejection. Conference on Information Sciences and Systems.
Princeton University, 2002.

[203] M. Kimmel and D.E. Axelrod. Branching Processes in Biology, volume 19
of Interdisciplinary Applied Mathematics. Springer-Verlag,
New York, 2002.




[204] G. Kitagawa. Monte-Carlo filter and smoother for non-Gaussian nonlinear
state space models. J. Comput. Graphical Stat., 5(1):1-25,
1996.

[205] V.N. Kolokoltsov and V.P. Maslov. Idempotent Analysis and Its Applications,
volume 401 of Mathematics and its Applications. Kluwer
Academic Publishers Group, Dordrecht, 1997. Translation of Idempotent
Analysis and Its Application in optimal control (Russian), Nauka
Moscow, 1994, with an appendix by P. Del Moral.

[206] T. Koski. Hidden Markov Models for Bioinformatics, volume 2 of 
Computational Biology Series. Kluwer Academic Publishers, Dor- 
drecht, 2001. 

[207] J.H. Kotecha and P.M. Djuric. Sequential Monte-Carlo sampling 
detector for Rayleigh fast-fading channels. Proceedings of the IEEE 
International Conference on Acoustics, Speech and Signal Processing, 
Istanbul, Turkey, Springer-Verlag, New York, 2000.

[208] M.G. Krein and M.A. Rutman. Linear operators leaving invariant a
cone in a Banach space. American Mathematical Society Translation, 
no. 26, 1950. 

[209] K. Kremer and K. Binder. Monte Carlo simulation of lattice models
for macromolecules. Comput. Phys. Rep., 1988. 

[210] A.K. Kron, O.B. Ptitsyn, A.M. Skvortsov, and A.K. Fedorov. Molec.
Biol., 1(487), 1967. 

[211] S. Kullback and R.A. Leibler. On information and sufficiency. Ann.
Math. Stat., 22:79-86, 1951.

[212] C. Kwok, D. Fox, and M. Meila. Adaptive real time particle filters
for robot localization. Proceedings of the 2003 IEEE International
Conference on Robotics and Automation, Taipei, Taiwan, 2003.

[213] D. Lamberton and B. Lapeyre. Introduction to Stochastic Calculus
Applied to Finance. Chapman and Hall, London, 1996. 

[214] G. Lawler. Intersections of Random Walks. Probability and Its Applications.
Birkhäuser, Boston, 1991.

[215] C.E. Lawrence, S.F. Altschul, M.S. Bogouski, J.S. Liu, A.F. Neuwald, 
and J.C. Wooten. Detecting subtle sequence signals: A Gibbs sampling
strategy for multiple alignment. Science, 262:208-214, 1993.

[216] A.R. Leach. Molecular Modeling, Principles and Applications. 
Longman-Harlow, London, 1996. 




[217] M. Ledoux and M. Talagrand. Probability in Banach Spaces. Springer-Verlag,
New York, 1991.

[218] P.J. Van Leeuwen and G. Evensen. Data assimilation and inverse 
methods in terms of a probabilistic formulation. Mon. Weather Rev., 
124:2898-2913, 1996. 

[219] F. LeGland and N. Oudjane. Stability and uniform approximation of 
nonlinear filters using the Hilbert metric, and application to particle 
filters, to appear in The Annals of Applied Probability (2004). 

[220] F. LeGland and N. Oudjane. A robustification approach to stability
and to uniform particle approximation of nonlinear filters: The example
of pseudo-mixing signals. Stochastic Processes Appl., 106(2):279-316,
2003.

[221] A.J. Leigh and V. Krishnamurthy. An improvement to the interacting
multiple model algorithm. IEEE Trans. on Signal Processing,
49(12):2909-2923, 2001.

[222] D. Lerro and Y. Bar-Shalom. Interacting multiple model tracking 
with target amplitude feature. IEEE Trans. Aerosp. Electron. Syst.,
29(2):494-508, 1993. 

[223] J. Li and R.M. Gray. Image Segmentation and Compression Using 
Hidden Markov Models. Kluwer Academic Publishers, Dordrecht, 
2000. 

[224] X.R. Li. Multiple model with variable structure: Model group switch- 
ing algorithm. Proceedings of the 36th Conference on Decision and 
Control, San Diego, CA, pages 3114-3119. 1997. 

[225] X.R. Li and Y. Bar-Shalom. Multiple model estimation with variable 
structure. IEEE Trans. Autom. Control, 41(4):479-493, 1996. 

[226] F. Liang and W.H. Wong. Evolutionary Monte Carlo for protein 
folding simulations. J. Chem. Phys., 115 (7), pp. 3374-3380, 2001. 

[227] J.S. Liu. Monte-Carlo Strategies in Scientific Computing. Springer 
Series in Statistics, Springer, New York, 2001.

[228] J.S. Liu and R. Chen. Sequential Monte-Carlo methods for dynamic 
systems. J. Am. Stat. Assoc., 93(443):1032-1044, 1998. 

[229] J.S. Liu and S. Jensen. Computational discovery of gene regulatory
binding motifs: A Bayesian perspective. Tech. Rep., Department of
Statistics, Harvard University, Cambridge, 2003.




[230] J.S. Liu, A. Kong, and W.H. Wong. Sequential imputation method 
and Bayesian missing data problems. J. Am. Stat. Assoc., 89:278- 
288, 1994. 

[231] J.S. Liu, S. Kou, and S. Xie. Bayesian analysis of single molecule
experiments. Tech. Rep., Department of Statistics, Harvard University,
Cambridge, 2003.

[232] J.S. Liu and C.E. Lawrence. Bayesian inference on biopolymer mod- 
els. Bioinformatics, 15:38-52, 1999. 

[233] J.S. Liu and T. Logvinenko. Bayesian methods in biological sequence 
analysis. Handbook of Statistical Genetics, 2nd ed. D.J. Balding, M. 
Bishop, and C. Cannings, editors. Wiley, Chichester, 2003.

[234] J.S. Liu, A.F. Neuwald, and C.E. Lawrence. Bayesian models for 
multiple local sequence alignment and Gibbs sampling strategies. J. 
Am. Stat. Assoc., 90(432):1156-1170, 1995. 

[235] J.S. Liu and J.Z. Zhang. A new sequential importance sam- 
pling method and its applications to the 2-dimensional hydrophobic- 
hydrophilic model. J. Chem. Phys., 117(7), pp. 3492-3498, 2002. 

[236] L. Ljung. System Identification, Theory for the User. Prentice Hall 
Information and System Sciences Series. Prentice Hall, Englewood 
Cliffs, NJ, 1987. 

[237] I. L. MacDonald and W. Zucchini. Hidden Markov and other Models
for Discrete-Valued Time Series. Chapman and Hall, London, 1997. 

[238] D.D. Magill. Optimal adaptive estimation of sampled stochastic processes.
IEEE Trans. on Autom. Control, 10(4):434-439, 1965.

[239] A.D. Marrs, N.J. Gordon, and D.J. Salmon. Sequential analy- 
sis of nonlinear dynamic systems using particles and mixtures. In 
P.C. Young, W.J. Fitzgerald, A. Walden, and R.L. Smith, editors.
Nonlinear and Nonstationary Signal Processing. Cambridge Univer- 
sity Press, Cambridge, 2001. 

[240] V.P. Maslov. Méthodes opératorielles. Edition Mir, Moscow, 1987.

[241] P.S. Maybeck and R.I. Suizu. Adaptive tracker field of view variation 
via multiple model filtering. IEEE Trans. Aerosp. Electron. Syst.,
21(4):529-537, 1985. 

[242] V. Melik-Alaverdian and M.P. Nightingale, Quantum Monte Carlo 
methods in statistical mechanics, Internat. J. of Modern Phys. C,
vol. 10, no. 8, pp. 1409-1418, 1999. 




[243] H.P. McKean, Jr. A class of Markov processes associated with nonlinear
parabolic equations. Proc. Natl. Acad. Sci. U.S.A., 56:1907-1911,
1966.

[244] H.P. McKean, Jr. Propagation of chaos for a class of non-linear 
parabolic equations. In Stochastic Differential Equations (Lecture Se- 
ries in Differential Equations, Session 7, Catholic University, 1967), 
pages 41-57. Air Force Office of Scientific Research, Arlington, VA,
1967.

[245] S. Méléard. Asymptotic behaviour of some interacting particle systems;
McKean-Vlasov and Boltzmann models. In D. Talay and
L. Tubaro, editors. Probabilistic Models for Nonlinear Partial Differential
Equations, Montecatini Terme, 1995, Lecture Notes in Mathematics
1627. Springer-Verlag, Berlin, 1996.

[246] S. Méléard. Convergence of the fluctuations for interacting diffusions
with jumps associated with Boltzmann equations. Stochastics
Stochastics Rep., 63(3-4):195-225, 1998. 

[247] S. Méléard. Probabilistic interpretation and approximations of some
Boltzmann equations. In Stochastic models (Spanish) (Guanajuato, 
1998), volume 14 of Aportaciones Mat. Investig., pages 1-64. Soc. 
Mat. Mexicana, Mexico, 1998. 

[248] S. Méléard. Stochastic approximations of the solution of a full Boltzmann
equation with small initial data. ESAIM Probab. Stat., 2:23-40,
1998. 

[249] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, E. Teller, and
A.H. Teller. Equation of state calculations by fast computing machines.
J. Chem. Phys., 90:233-241, 1953.

[250] S.P. Meyn and R.L. Tweedie. Markov Chains and Stochastic Sta- 
bility. Communications and Control Engineering Series, Springer- 
Verlag London Ltd., London, 1993. 

[251] J.E. Moyal. The general theory of stochastic population processes. 
Acta Math., 108:1-31, 1962. 

[252] J.E. Moyal. Multiplicative population chains. Proc. R. Soc. Ser. A, 
266:518-526, 1962. 

[253] C. Musso and N. Oudjane. Regularized particle schemes applied to 
the tracking problem. In International Radar Symposium, Munich, 
Proceedings, September 1998. 

[254] M. Nagasawa. Stochastic Processes in Quantum Physics. Monographs
in Mathematics, vol. 94. Birkhäuser-Verlag, Boston, 1991.




[255] D.B. Nelson. ARCH models as diffusion approximations. J.
Economet., 45(1-2):7-38, 1990.

[256] A.F. Neuwald, J.S. Liu, and C.E. Lawrence. Gibbs motif sampling: 
Detection of bacterial outer membrane repeats. Protein Sci., 4:1618- 
1632, 1995. 

[257] J.N. Nielsen and M. Vestergaard. Estimation in continuous time 
stochastic volatility models using nonlinear filters. Int. J. Theor. 
Appl. Finance, 3(2):279-308, 2000. 

[258] A. Nix and M.D. Vose. Modelling genetic algorithms with Markov
chains. Ann. Math. Artificial Intelligence, 5:79-88, 1991. 

[259] P-J. Nordlund, F. Gunnarsson, and F. Gustafsson. Particle filters for
positioning in wireless networks. Proceedings of EUSIPCO, Toulouse,
France, 2002.

[260] D. Ocone. Entropy inequalities and entropy dynamics in nonlinear
filtering of diffusion processes. In Stochastic Analysis, Control, Optimization
and Applications, Systems Control Foundations and Applications,
pages 477-496. Birkhäuser, Boston, 1999.

[261] P. Shahabuddin, P. Glasserman, P. Heidelberger, and T. Zajic. Mul- 
tilevel splitting for estimating rare event probabilities. Oper. Res., 
47(4):585-600, 1999. 

[262] E. Pardoux. Filtrage non linéaire et équations aux dérivées partielles
stochastiques associées. In P.L. Hennequin, editor, École d’Été de
Probabilités de Saint-Flour XIX-1989, Lecture Notes in Mathematics
1464. Springer-Verlag, Berlin, 1991.

[263] M. Peinado. Go with the winners algorithms for cliques in random
graphs. Algorithms Comput., 2223:525-536, 2001. 

[264] M. Peinado and T. Lengauer. Go with the winners generators with 
applications to molecular modeling. Random. Approx. Tech. Comput. 
Sci., 1269:135-149, 1997. 

[265] E.A. Perkins. Conditional Dawson-Watanabe processes and Fleming-Viot
processes. Seminar in Stochastic Processes, pages 142-155,
1991.

[266] D.T. Pham. Stochastic methods for sequential data assimilation
in strongly nonlinear systems. Mon. Weather Rev., 129:1194-1207,
1992.

[267] M.S. Pinsker. Information and Information Stability of Random 
Variables and Processes. Holden Day, San Francisco, 1964.




[268] R.G. Pinsky. Positive Harmonic Functions and Diffusions, an In- 
tegrated and Analytic Approach. Cambridge Studies in Advanced 
Mathematics, 45. Cambridge University Press, Cambridge, 1995. 

[269] M.K. Pitt and N. Shephard. Filtering via simulation: Auxiliary par- 
ticle filters. J. Am. Stat. Assoc., 93(443):1022-1031, 1998. 

[270] M.K. Pitt and N. Sheppard. Filtering via simulation: auxiliary par- 
ticle filters. J. Am. Stat. Assoc., 94, 590-599, 1999. 

[271] D. Pollard. Convergence of Stochastic Processes. Springer Verlag, 
New York, 1984. 

[272] A. Puhalskii. On functional principle of large deviations. New Trends 
Probab. Stat., 1:198-219, 1991.

[273] M. Pulvirenti. Kinetic limits for stochastic particle systems. In Prob- 
abilistic Models for Nonlinear Partial Differential Equations (Monte- 
catini Terme, 1995), volume 1627 of Lecture Notes in Mathematics, 
pages 96-126. Springer, Berlin, 1996. 

[274] E. Punskaya, A. Doucet, and W.J. Fitzgerald. Particle Filtering for 
Joint Symbol and Code Delay Estimation in DS Spread Spectrum 
Systems in Multipath Environment. To appear in J. Applied Signal 
Processing, 2004. 

[275] L.R. Rabiner. A tutorial on hidden Markov models and selected applications
in speech recognition. Proc. IEEE, 77(2):257-285, 1989.

[276] S.T. Rachev. Probability Metrics and the Stability of Stochastic Mod- 
els. Wiley, New York, 1991. 

[277] M. Reed and B. Simon. Methods of Modern Mathematical Physics,
II, Fourier Analysis, Self Adjointness. Academic Press, New York, 
1975. 

[278] S. I. Resnick. Adventures in Stochastic Processes. Birkhäuser,
Boston, 1994. 

[279] D. Revuz. Markov Chains. North Holland, Amsterdam, 1984. 

[280] M.N. Rosenbluth and A.W. Rosenbluth. Monte-Carlo calculations of
the average extension of macromolecular chains. J. Chem. Phys., 
23:356-359, 1955. 

[281] I.N. Sanov. On the probability of large deviations of random variables.
Select. Transl. Math. Statist. and Probability, Vol. 1, pp. 213-244,
Inst. Math. Statist. and Amer. Math. Soc., Providence, R.I.,
1961.




[282] D. Schultz, W. Burgard, D. Fox, and A.B. Cremers. People track- 
ing with a mobile robot using sample based joint probabilistic data 
association filters. Int. J. Robotics Res., 22(2), 2003.

[283] E.A. Semerdjiev and L.S. Mihaylova. Adaptative IMM algorithm 
for manouevring ship tracking. Proceedings of the first International 
Conference on Multisource-Multisensor Information Fusion (FU- 
SION’98), Las Vegas, Nevada, volume 2, pages 974-979, C.S.R.E.A. 
Press, Athens, Georgia, 1998. 

[284] E.A. Semerdjiev, L.S. Mihaylova, and Tz. Semerdjiev. Manouevring 
ship model identification and IMM tracking algorithm design.
Proceedings of the first International Conference on Multisource- 
Multisensor Information Fusion (FUSION’98), Las Vegas, Nevada, 
volume 2, pages 968-973, C.S.R.E.A. Press, Athens, Georgia, 1998. 

[285] J. Shapcott. Index tracking: Genetic algorithms for investment 
portfolio selection. Technical Report SS92-24, EPCC, Edinburgh,
September 1992. 

[286] T. Shiga and H. Tanaka. Central limit theorem for a system of Markovian
particles with mean field interaction. Z. Wahrschein. Verwandte
Gebiete, 69:439-459, 1985. 

[287] A.N. Shiryaev. Probability, second edition. Volume 95 in Graduate 
Texts in Mathematics. Springer-Verlag, New York, 1996.

[288] G.R. Shorack. Probability for Statisticians. Springer Texts in Statis- 
tics, Springer, New York, 2000. 

[289] D. Siegmund. Sequential Analysis: Tests and confidence intervals. 
Springer Verlag, New York, 1985. 

[290] N. Smirnoff. Über Wahrscheinlichkeiten grosser Abweichungen. Rec.
Soc. Math. Moscow, 40:441-455, 1933. 

[291] D.J. Spiegelhalter, W.R. Gilks, and S. Richardson. Monte Carlo
Markov Chain in Practice. Chapman and Hall, London, 1996. 

[292] R.F. Stengel. Optimal Control and Estimation. Dover Publications 
Inc., New York, 1986. 

[293] D.W. Stroock. An Introduction to the Theory of Large Deviations. 
Springer-Verlag, Berlin, 1984.

[294] A.S. Sznitman. Topics in propagation of chaos. In P.L. Hennequin, 
editor, École d’Été de Probabilités de Saint-Flour XIX-1989, Lecture
Notes in Mathematics 1464. Springer-Verlag, Berlin, 1991. 




[295] A.S. Sznitman. Brownian Motion, Obstacles and Random Media.
Springer-Verlag, Monographs in Mathematics, New York, 1998.

[296] S. Tindel and F. Viens. Convergence of a branching particle system 
to the solution of a parabolic SPDE on the circle. Random Oper.
Stochastic Equations, (to appear), 2003. 

[297] P. Torma and Cs. Szepesvári. Towards facial pose tracking. In Proc.
First Hungarian Computer Graphics and Geometry Conference, Budapest,
Hungary, pp. 10-16, 2002.

[298] J. K. Townsend, Z. Haraszti, J. A. Freebersyser, and M. Devetsiki- 
otis. Simulation of rare events in communication networks. IEEE 
Commun. Mag., Vol. 36, No. 8, pages 36-41, 1998. 

[299] D. Treyer, D.S. Weile, and E. Michielsen. The application of novel
genetic algorithms to electromagnetic problems. In Applied Computational
Electromagnetics, Symposium Digest, volume 2, pages 1382-1386,
Monterey, CA, March 1997.

[300] S.R.S. Varadhan. Asymptotic probabilities and differential equations.
Commun. Pure Appl. Math., 19:261-286, 1966.

[301] A.D. Ventcel and M.I. Freidlin. On small perturbations of dynamical 
systems. Russian Math. Surveys, 25:1-55, 1970. 

[302] F. Viens. Portfolio optimization under partially observed stochastic
volatility. In COMCON 8. The 8th International Conference on
Advances in Communication and Control. W. Wells, editor, pages 
1-12. Optim. Soft., Inc., 2002. 

[303] M. Villen-Altamirano, A. Martinez-Marron, J. Gamo, and F.
Fernandez-Questa. Enhancements of the accelerated simulation
method restart by considering multiple thresholds. In Proceedings of
the 14th International Teletraffic Congress. The Fundamental Role
of Teletraffic in the Evolution of the Telecommunication Networks. 
J. Labetoulle and J.W. Roberts, editors. Elsevier Science Publishers, 
Amsterdam, pages 797-810, 1994. 

[304] M. Villen-Altamirano and J. Villen-Altamirano. Restart: a method 
for accelerating rare event simulation. In Proceedings of the 13th In- 
ternational Teletraffic Congress. In Queueing Performance and Con- 
trol in ATM, J.W. Cohen and C.D. Pack, editors. Elsevier Science 
Publishers, Amsterdam, pages 71-76, 1991. 

[305] M. Villen-Altamirano and J. Villen-Altamirano. Restart: a straightforward
method for fast simulation of rare events. Proceedings of the
1994 Winter Simulation Conference, pages 282-289. IEEE Computer 
Society Press, Piscataway, NJ, 1994. 




[306] M. D. Vose. The Simple Genetic Algorithm, Foundations and Theory. 
The MIT Press Books, Cambridge, 1999. 

[307] F.T. Wall and J.J. Erpenbeck. J. Chem. Phys., 30:634-637, 1959. 

[308] F.T. Wall and F. Mandel. Macromolecular dimensions obtained by 
an efficient Monte Carlo method without sample attrition. J. Chem. 
Phys., Vol 63(11) pp. 4592-4595, 1975. 

[309] D. Whitley. A genetic algorithms tutorial. Statistics and Computing, 
(4):65-85, 1994. 

[310] D. Whitley. An Overview of Evolutionary Algorithms, J. Information
and Software Technology, 43:817-831, 2001.

[311] A.N. Van der Vaart and J.A. Wellner. Weak Convergence and Em- 
pirical Processes with Applications to Statistics. Springer Series in 
Statistics. Springer, New York, 1996. 

[312] H. Wieland. Unzerlegbare, nicht negative Matrizen. Math. Z., vol.
52, 642-648, 1950. 

[313] G. Wrinkler. Image Analysis, Random Fields and Markov Chain 
Monte Carlo Methods, a Mathematical Introduction, 2nd edition. Applications
of Mathematics Series 27, Springer-Verlag, New York, 2003.

[314] Y.M. Zhang and X.R. Li. Detection and diagnostic of sensor and ac- 
tuator failures using IMM estimation. IEEE Trans. Aerosp. Electron. 
Syst., 34(4):1295-1312, 1998.




Index 



{E,€),7 



(f;„,5n),48 

256 

256 



(©^, 


©)-integral operator, 


(«)p. 


222 


E^n,7l 


E'ip^n 


1 , 10 


^[O.nl 


, 459 


•^[p.n] 


,10 




258 


Gp,n, 


89 




I /), 122 


En,rj, 


30 


r(i) 


35 




35 


M{f] 


1,9 


M", 


10 


M 1 M 2 , 9 


^p,n\ 


,88 


p(q) 

^p,n<i 


258 


*^,n> 


89 


Vp,n? 


258 



Qn, 14 



Qp,ni 88 

4?l 89 
5n,»7> 31, 73 
V+, 35 
V~, 35 
■^(0,n]i 459 
B^{U), 381 
Bb(E), 7 
Bb{U), 364 
BbiUn), 380 
E[x)(.), 83 
E^(.), 58 

Ep,/ip) 88 

Ex(.), 58 
227 

^ 00,51 

K„,n. 74 
K„,74 
-Cf , 35 

114, 255 

114, 255 
Pp(.). 58 
Pp,xp> 87 
Px(.). 58 




V{U\ 364 
V{Un). 380 
V^{E), 336 
380 
60, 70 
P{En), 379 
P"(£;), 379 
^n,70 
^p.n, 89 
13, 31, 61 
^p,ni 133 

Qn, 11 

Qt, 11 

e«, 255 
e’,„, 255 
Zn, 11, 58, 63 



2t, 11 

Bb{E), 7 
p{M), 127 
pm, 132 
PHm, 132 
€p,„(G), 139 
31, 111 
Vt’, 36 
VT, 111 
Vn, 63 

■qn, 12, 30, 59, 88 



Vt, 12 

31, 111 



7^, 36 



In , 



111 



7 ;, 63 

7 „, 12, 30, 59, 61, 88 



7t, 12 

m, 24 

(g,AT}, 256, 267 
{q), 256 
Enty(., .), 365 
Osci(£), 8 
osc(/), 8 
A»M, 9 
an, 138 
7 

r-topology, 337 
Ti-topology, 379 



9?fn, 144 
66, 68, 92 
Gn, 65, 68, 70 
Gp,„, 92 
M„, 65, 68, 70 
Pp,n, 92 
Qp,n, 92 
92 

$n,70 
$p.n,92 
$n,70 
fin, 60, 91 
7„, 60, 61, 91 
31, 97 
a, 35 

31, 97 
35 

31, 97 
35 

i^\n, 105 
^n, 31, 97 
^t,35 

du{’, '), 365 
m{X), 221, 271 
m(4n), 97 
m(i), 221, 267 
m(x)®«, 267 
m(x)®9, 267 
Qp,n> 91 
M{E), 7 
M+{E), 7 
A<o(£^), 7 
Ent(.), 8 

(max,+)-semiring, 341 

Absorbed particle, 68, 72 
Absorbing condition, 446 
Absorbing medium, 22 
Absorption events, 71, 442 
Acceptance/rejection, 41 
Accessibility condition, 66, 67, 92 
Adaptive dynamic, 41
Adaptive stochastic search, 40 




Additive set functions, 379 
Ancestor, 33, 397 
Ancestral line, 33, 36, 105 
Angle bracket, 236, 307 
Approximation measures, 111
Asymptotic stability, 122 
Auto-regressive model, 56 

Baker’s selection, 388, 424 
Baldi and Dembo-Zeitouni theorem, 351

Ballistic events, 446, 450 
Bayesian 

prior and posterior, 21, 499 
Birth and death process, 33, 36, 
55, 446, 450, 457 
Boltzmann 

operator, 69 
rarefied gas models, 108 
Boltzmann entropy, 122 
Boltzmann-Gibbs 

asymptotic properties, 271
distribution, 32 
transformation, 13, 60 
Bootstrap filters, 41 
Branching and interacting parti- 
cle systems, 41, 405 
Branching excursions model, 404 
Branching selections, 41, 388 
Buffer overflows, 430 

Canonical 
chain, 51 
space, 51 

Cemetery state, 68, 71, 447 
Central limit theorem, 291 

particle density profiles, 300, 
301 

path space models, 322 
triangular arrays, 291, 294, 
295 

Chain growth methods, 488 
Change of reference probability, 
63, 459, 499 
Chemical bonds, 484 



Coffin state, 68 
Colliding molecules, 40
Combinatorial transport equation, 
267 

Communication networks, 431, 501 
Compatibility condition, 107 
continuous time, 23 
discrete time, 76 
Condensation filters, 41, 429 
Conditional explorations, 400, 404, 
423, 492, 493 
Conditions 
(G), 115 
(/)m, 139 

{LU 182 

{MU, 116 

{QU, 139 

(A) , 67, 220 

(B) , 220 

116 

(M)**P, 116 
($), 472 
(G), 147 
{MU, 147 
{QU, 147 

{Ha), 135 

Connecting maps, 360 
Connective constant, 497 
Continuous mapping theorem, 299 
Contraction coefficients, 132, 138, 
472, 508 

Coordinate method, 50 
Covering numbers, 227 
Cramér technique, 333
Creation and killing, 40 
Csiszar divergence, 123 
Cylinder set, 50 

Data assimilation, 21 
Dawson-Gärtner

projective methods, 333, 359 
Delta method, 291, 299 
Descendant genealogy, 397 
Directed polymer, 427 




Dirichlet problems, 54, 427, 430,
431, 440

Disintegration, 396, 397 
DNA sequences, 428 
Dobrushin ergodic coefficient, 127 
Domb-Joyce model, 496 
Donsker’s theorem, 292, 318 
particle models, 318 
Dynkin-Mandelbaum theorem, 323, 
326 

Economical time series, 428 
Edwards’ model, 496 
Elementary transition, 49
Energy function, 78 
Ensemble Kalman filters, 41, 429 
Entropy integral, 228 
Evolutionary mathematics, 40 
Exchangeable measure, 262 
Excursion particles, 431 
Excursion-space models, 52 
Exploration, 71 
Exponential tightness, 349 
particle models, 351 
Extended Black-Scholes model, 503 
Extended Kalman-Bucy filter, 428 
Extinction Probabilities, 231 

Feynman-Kac measures 
annealed models, 83 
conditional, 396 
continuous time models, 11, 12

discrete time models, 11, 12, 
47 

distribution flows, 58, 68 
distribution space models, 85 
excursion-space models, 431 
normalizing constant, 12 
path space models, 34, 62, 110 
prediction models, 60, 88, 110 
quenched models, 83 
random medium models, 81 
time marginals, 34 
unnormalized models, 60, 110 



updated models, 60, 88, 110 
Feynman-Kac semigroups 

contraction properties, 132 
functional inequalities, 134, 137 
McKean models, 277 
oscillations, 133 
prediction models, 88 
stochastic models, 152 
updated models, 91 
weak regularity properties, 144 
Feynman-Kac-Metropolis models, 
164, 166 

Financial mathematics, 41 
Fluctuations, 113, 450 

Galton-Watson model, 26
Gateaux differentiability, 351 
Genealogical tree, 25, 396, 450 
descendant and ancestral ge- 
nealogies, 396 

interacting particle models, 95 
models, 33, 36, 103 
Genetic 

algorithms, 41, 77 
particle model, 40 
population, 25 
Gibbs sampling, 41, 393 
Glivenko-Cantelli theorem, 241 
Global optimization, 41 
Global positioning system, 15, 428, 
500 

Go with the winner, 41, 102 

h-relative entropy, 122 

variational representation, 136 
Hahn-Jordan decomposition, 125 
Hamiltonian function, 26, 486 
Hausdorff topological space, 339 
Havrda-Charvat entropy, 122 
Hellinger integrals, 122 
Hidden Markov models, 504, 521 
Hilbert-Schmidt operator, 323 
Historical process, 52, 64, 103 

Idempotent analysis, 332, 340 




Idempotent probability measures, 
332 

Image processing, 41 
Importance sampling, 460, 465 
Inequalities 

Burkholder, 223 
Bernstein, 222 
Berry-Esseen, 292, 306 
martingale sequences, 306, 
309, 310 

particle models, 311 
Chernov-Hoeffding, 223
Csiszar, 262 
Khinchine, 223 
Marcinkiewicz-Zygmund, 223 
Infinitesimal neighborhood, 49 
Integral operator, 9 
Interacting jump, 75 
Interacting Kalman-Bucy filters, 
518 

Interacting Metropolis models, 29, 
41, 389 

Interacting particle systems, 95, 
394 

Interacting process interpretation, 
73, 394 

Invariant measures, 157, 472
existence and uniqueness, 160
Ionescu-Tulcea theorem, 459
Ising model, 389 

Jump generator, 24 

Kakutani-Hellinger integrals, 122 
Kallianpur-Striebel formula, 16
Kalman-Bucy filters, 79 
Killing, 22, 443 

annealed properties, 198 
interpretation, 71 
transition, 71 

Kolmogorov-Smirnov metric, 365
Kushner-Stratonovitch equation, 
17 

Laplace-Varadhan lemma, 333, 352,
354



extended version, 354 
Large-deviation principles, 113, 331 
definition, 335 
lower bound, 340
McKean models, 337, 374 
upper bound, 339 
weak principles, 340 
Lebesgue decomposition, 123 
Legendre-Fenchel transformation, 348

Levy’s convergence theorem, 291 

Lifetime, 23 

Likelihood 

asymptotic properties, 510, 521 
functions, 504, 506 
ratio, 461 

Lindeberg condition, 297 
Logarithmic addition/multiplication, 
340 

Logarithmic moment-generating func- 
tion, 349 

Lower semicontinuity, 340 
Lyapunov exponent, 469 

Macromolecules, 484 
Markov chain, 48 

canonical model, 50 
excursion-space models, 52
nonhomogeneous, 58 
path-space models, 51, 52 
stopped models, 52 
Markov kernel 
definition, 9 
operator, 9 
McKean 

interpretations, 75, 77, 394 
measures, 68, 74, 111 
models, 76 

Mean field particle process, 74 
Metropolis-Hastings models, 41, 164, 
488 

Micro-statistical mechanics, 23, 40 
Mixing conditions, 139 
Moment-generating function 
independent sequences, 224 




interacting models, 247 
Monomers, 484 

Multiple Hypothesis Testing al- 
gorithm, 518 

Multiple models estimation, 501 
Multiple Wiener integrals, 324 
Multisplitting method, 428, 429, 
439, 451, 463 
Mutation, 32, 98 
Myopic self-avoiding walks, 496 

Nanbu particle model, 108 
Natural evolution models, 40
Nonlinear filtering, 427, 497 
conditional distribution, 16 
definition, 15 

discrete time formulation, 17 
discrete time observations, 17 
observation process, 15 
partially linear/Gaussian, 513 
robust equation, 18 
signal process, 15 
speech separation, 20 
stability properties, 153, 508 
stochastic volatility, 20 
tracking problems, 19 

Obstacles, 68, 440 

hard and soft, 22, 72, 475 
repulsive, 73, 475 
Occupation measure, 33 
Ocean prediction, 428 
Offsprings, 39 

One-dimensional neutron model, 
149, 469 

One-step predictor, 154, 505 
Optimal control, 522 
Optimal filter, 154, 505 

Parabolic Anderson model, 48 
Particle approximation measure, 
109 

Particle filters, 41, 429, 512 
Particle genealogy, 103 
Particle Lyapunov exponents, 473 



Particle regulation, 522 
Particle simulation, 41 
Particles, 30 
Path particles, 111
Path-space models, 51 
Perturbation sequence, 15 
Perturbation theory, 469 
Poisson problem, 55 
Polish space, 333 
Polymers, 40, 484 

degree of polymerization, 25, 
484 

directed polymers, 25 
intermolecular interaction, 25 
nonintersecting chains, 25
simulation models, 487, 490 
solvent, 25 

Positive operators, 469 
Potential 

creation and killing, 23 
Prediction, 70 
Preordered set, 359 
Projective limit space, 360 
Projective limit topology, 359 
Projective spectrum, 360 
Propagation of chaos, 113 
entropy estimates, 259 
strong chaoticity, 257 
total variation estimates, 260 
weak chaoticity, 253 
Protein-folding problems, 388 
Prune enrichments, 41, 488, 495

Quadratic characteristic, 307 
Quantum physics, 22 
Quenched Kalman-Bucy filters, 515 
Queueing model, 56 

Radar processing, 15, 497 
Raleigh-Ritz principle, 470 
Random excursion models, 429, 
448, 459, 489, 493 
Random medium, 81 
Rare events, 427, 430, 444, 463 
Ratio tests, 462 




Reconfiguration, 41 
Regular topological space, 339 
Reinforced random walks, 487, 490 
Rejuvenation, 41 
Relaxation time, 144, 152, 191 
Remainder stochastic sampling, 388 
Reptilian algorithms, 487 
Repulsive/attractive interaction, 485, 
488 

Resampling, 41 
Restart method, 41, 431 
Restricted Markov chains, 421 
Ruin process, 446, 455 

Sampling-importance-resampling, 41, 429

Sanov theorem, 363, 373 
Satellite constellation, 501 
Schrodinger 

equations, 23 
operator, 23, 24 
top eigenvalue/vector, 24 
semigroups, 427, 469 
Selection, 32, 98 

Self-avoiding random walks, 388, 
489, 493, 495 

Sequential Monte Carlo methods, 37, 421, 429 

Shannon-Kullback information, 122 
Shiga-Tanaka formula, 323 
Simple random walk, 55 
Skorohod theorem, 299 
Slithering tortoise algorithms, 487 
Slutsky’s technique, 291, 298 
Spawning, 41 

Spectral analysis, 469, 477 
Spectral radius, 470, 477 
Statistical hypothesis, 462, 502 
Stein lemma, 307 
Stein’s technique, 307 
Stochastic linearization, 39 
Stochastic volatility, 503 
Stopped process, 12 
Storage and dam model, 55 
Strong contraction estimates, 142 



Strong law of large numbers, 231 
Sub-Markov property, 68 
Switching, 41 
Switching models, 502 

Telecommunication analysis, 41 
Time uniform estimate, 244 
Toeplitz-Kronecker lemma, 194 
Topological space, 339 
Topology, 339 

Total variation distance, 124 
Trace class operator, 324 
Tracking problems, 428, 497, 500 
Transport problem, 103 
Trapping analysis, 22 
Trapping interpretation, 68 
Tychonoff’s theorem, 360 
Type I/II errors, 463 

Unbiased estimate, 112 
Updating, 70 

Upper semi-continuity, 340 
Upper semicontinuity, 340 
Urn model, 56 

Vague topology, 335 
Variational entropy formula, 366 

Weak topology, 335 
Weak-* topology, 336 
Weighted bootstrap, 407 

Zolotarev seminorm, 227 





ALSO AVAILABLE FROM SPRINGER! 






AN INTRODUCTION TO RARE 
EVENT SIMULATION 

JAMES A. BUCKLEW 

This book presents a unified theory of rare event 
simulation and the variance reduction technique 
known as importance sampling from the point of 
view of the probabilistic theory of large devia- 
tions. This perspective allows us to view a vast 
assortment of simulation problems from a uni- 
fied single perspective. 

This text keeps the mathematical preliminaries 
to a minimum with the only prerequisite being a 
single large deviation theory result that is given 
and proved in the text. It concentrates on demon- 
strating the methodology and the principal ideas 
in a fairly simple setting. It includes detailed sim- 
ulation case studies covering a wide variety of 
application areas including statistics, telecom- 
munications, and queueing systems. 

2004/270 PP./HARDCOVER/ISBN 0-387-20078-9 
SPRINGER SERIES IN STATISTICS 

LIMIT THEOREMS FOR 
RANDOMLY STOPPED 
STOCHASTIC PROCESSES 

D.S. SILVESTROV 

This volume is the first to present a state-of-the- 
art overview of this field, with many of the results 
published for the first time. It covers the general 
conditions as well as the basic applications of the 
theory, and it covers the vast and technically 
demanding Russian literature in detail. A survey 
of the literature and an extended bibliography of 
works in the area are also provided. The coverage 
is thorough, streamlined and arranged according 
to difficulty for use as an upper-level text. 

2004/416 PP./HARDCOVER/ISBN 1-85233-777-X 
PROBABILITY AND ITS APPLICATIONS 



AN INTRODUCTION TO THE 
THEORY OF POINT PROCESSES 

Volume I: Elementary Theory 
and Methods 

Second Edition 

DARYL J. DALEY and DAVID VERE-JONES 

Point processes and random measures find wide 
applicability in telecommunications, earthquakes, 
image analysis, spatial point patterns, and stere- 
ology, to name but a few areas. The authors have 
made a major reshaping of their work in their first 
edition of 1988 and now present their Introduction 
to the Theory of Point Processes in two volumes. 
Volume One contains the introductory chapters 
from the first edition, together with an informal 
treatment of some of the later material intended 
to make it more accessible to readers primarily 
interested in models and applications. The main 
new material in this volume relates to marked point 
processes and to processes evolving in time, 
where the conditional intensity methodology 
provides a basis for model building, inference, 
and prediction. 

2003/464 PP./HARDCOVER/ISBN 0-387-95541-0 
PROBABILITY AND ITS APPLICATIONS 






